LUMO

Lifelong Multimodal Language Learning by Explaining and Exploiting Compositional Knowledge

Duration

36 months

July 2025 – June 2028

Host Institution

University of Hamburg

Department of Informatics

Focus

Robust multimodal lifelong learning

Vision-language and robot learning

Why LUMO?

LUMO studies how multimodal AI systems can remain reliable when tasks change. The project focuses on lifelong learning across vision-language understanding and language-conditioned robotic manipulation.

We combine controlled benchmarks, concept-based explainability, neuro-symbolic learning, and sim2real experiments to understand how compositional knowledge is formed, preserved, and reused.


Research Objectives

O1 – Datasets & Environments

Build diagnostic datasets and simulation environments for compositional vision-language learning and language-conditioned robotic manipulation.

O2 – Concept-Based Explanations

Explain how concepts and relations emerge during continual learning using concept-based XAI and training-dynamics analysis.

O3 – Neuro-Symbolic Integration

Inject symbolic constraints into embedding spaces to improve retention, compositionality, and robust transfer across tasks.

O4 – Sim2Real Transfer

Transfer insights from simulation to the NICO and NICOL robots and compare concept representations across simulated and real settings.