Multimodal Retrieval-Augmented Generation for Autonomous Driving Scenario Generation
- Institute
- Professorship of Autonomous Vehicle Systems
- Type
- Semester thesis, Master's thesis
- Content
- Experimental, theoretical
- Description
The safe deployment of autonomous vehicles requires diverse and realistic traffic scenarios for development and testing.
Existing scenario libraries mainly support text-based retrieval, which makes it hard to handle queries such as
“show me a high-speed merge at night similar to this picture” or
“find a scenario with two interacting agents and generate a variant with heavier traffic.”
Recent progress in multimodal retrieval-augmented generation (RAG), which combines visual and textual embeddings, makes it possible to jointly index text and images.
Such a system can retrieve and synthesize scenarios directly from natural-language or visual queries, providing a powerful foundation for scenario generation and validation in autonomous driving.
Objective
Develop and evaluate a multimodal RAG framework that
- supports Text → Image, Image → Scenario, and Hybrid retrieval,
- enables a chatbot interface to retrieve and generate new scenarios grounded in retrieved XML constraints, and
- provides rigorous evaluation of retrieval quality, generation fidelity, and robustness to noisy or ambiguous queries.
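As a rough illustration of what "Hybrid retrieval" could mean in practice, the following minimal sketch fuses a text-query embedding and an image-query embedding into one query vector and ranks indexed scenarios by cosine similarity. It assumes all embeddings already live in a shared space (for example, produced by a CLIP-style encoder); the function names, the weighting parameter `alpha`, and the brute-force NumPy search are illustrative assumptions, not part of the project specification. A real system would likely replace the brute-force search with a vector database.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norm, 1e-12, None)


def hybrid_search(index, text_query=None, image_query=None, alpha=0.5, k=3):
    """Return the top-k (row, cosine score) pairs for a text, image, or hybrid query.

    index:       (n, d) array of scenario embeddings (hypothetical, precomputed).
    text_query:  (d,) text embedding, or None.
    image_query: (d,) image embedding, or None.
    alpha:       weight of the text part when both queries are given (assumption).
    """
    if text_query is None and image_query is None:
        raise ValueError("Provide a text query, an image query, or both.")
    if text_query is None:
        query = l2_normalize(np.asarray(image_query, dtype=float))
    elif image_query is None:
        query = l2_normalize(np.asarray(text_query, dtype=float))
    else:
        # Weighted sum of the two unit-length query vectors.
        mixed = alpha * l2_normalize(np.asarray(text_query, dtype=float)) \
            + (1.0 - alpha) * l2_normalize(np.asarray(image_query, dtype=float))
        query = l2_normalize(mixed)

    # Cosine similarity equals the dot product of unit-length vectors.
    scores = l2_normalize(np.asarray(index, dtype=float)) @ query
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]


# Toy usage with random stand-in embeddings (no real encoder involved).
rng = np.random.default_rng(0)
toy_index = rng.normal(size=(100, 16))
toy_text = rng.normal(size=16)
toy_image = rng.normal(size=16)
results = hybrid_search(toy_index, toy_text, toy_image, alpha=0.7, k=5)
```

The returned row indices would then be mapped back to scenario files (for example, the XML documents mentioned above), whose constraints can ground the generation step.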
We Offer
- A dynamic, future-oriented research environment.
- Hands-on experience with a state-of-the-art multimodal RAG stack (FAISS/Weaviate/Pinecone, CLIP/OpenCLIP, BLIP, DuckDB/Parquet, LangChain/LlamaIndex).
- Opportunity to publish a scientific paper (based on merit).
- Thesis can be written in English or German.
Requirements
- Initiative and a creative, problem-solving mindset.
- Excellent English or German proficiency.
- Strong Python skills; familiarity with PyTorch and basic computer vision/ML.
- Experience with at least one of: vector databases, information retrieval, VLMs/LLMs, or autonomous-driving data (e.g., CommonRoad).
- Familiarity with common development tools (Git, Ubuntu).
- (Nice to have) Experience with CARLA/MetaDrive or scenario standards (CommonRoad/OpenSCENARIO).
Start
Work can begin immediately. If you are interested in this topic, please first have a look at our recent survey paper: https://arxiv.org/abs/2506.11526
Then send a brief cover letter explaining why you are fascinated by this subject, along with a current transcript of records and your CV to: yuan_avs.gao@tum.de
- Tags
- AVS Gao
- Possible start
- Immediately
- Contact
- Yuan Gao, yuan_avs.gao@tum.de