Multimodal Retrieval-Augmented Generation for Autonomous Driving Scenario Generation

Institute
Professorship of Autonomous Vehicle Systems
Type
Semester thesis / Master's thesis
Content
Experimental / theoretical
Description

The safe deployment of autonomous vehicles requires diverse and realistic traffic scenarios for development and testing.
Existing scenario libraries mainly support text-based retrieval, which makes it difficult to serve queries such as
“show me a high-speed merge at night similar to this picture” or
“find a scenario with two interacting agents and generate a variant with heavier traffic.”

Recent progress in multimodal retrieval-augmented generation (RAG), which combines visual and textual embeddings, makes it possible to jointly index text and images.
Such a system can retrieve and synthesize scenarios directly from natural-language or visual queries, providing a powerful foundation for scenario generation and validation in autonomous driving.
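
As a rough illustration of joint indexing, the sketch below keeps text and image embeddings in one list and ranks scenarios by cosine similarity. The vectors are hand-written toy values standing in for the output of a shared text/image encoder such as CLIP. The scenario IDs, the `search` helper, and the weighted-average `hybrid_query` are assumptions made for illustration only, not part of an existing codebase.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy joint index: (scenario_id, modality, embedding).
# Assumption: in a real system these embeddings would come from a shared
# text/image encoder; the 3-d values here are made up.
INDEX = [
    ("merge_night", "image", [0.9, 0.1, 0.0]),
    ("merge_night", "text", [0.8, 0.2, 0.1]),
    ("urban_intersection", "image", [0.1, 0.9, 0.1]),
    ("roundabout_day", "text", [0.0, 0.2, 0.9]),
]


def search(query_vec, k=2):
    """Return the top-k (scenario_id, score) pairs.

    Each scenario is scored by its best-matching entry across modalities.
    """
    best = {}
    for scenario_id, _modality, emb in INDEX:
        score = cosine(query_vec, emb)
        if score > best.get(scenario_id, -1.0):
            best[scenario_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def hybrid_query(text_vec, image_vec, alpha=0.5):
    """Blend a text and an image query vector with a weighted average."""
    return [alpha * t + (1 - alpha) * i for t, i in zip(text_vec, image_vec)]


if __name__ == "__main__":
    query = hybrid_query([1.0, 0.0, 0.0], [0.8, 0.3, 0.0])
    for scenario_id, score in search(query):
        print(f"{scenario_id}: {score:.3f}")
```

Averaging the two query vectors is only one way to form a hybrid query; fusing the separate text and image rankings after retrieval is a common alternative.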


Objective

Develop and evaluate a multimodal RAG framework that

  1. supports Text → Image, Image → Scenario, and Hybrid retrieval,

  2. provides a chatbot interface that retrieves existing scenarios and generates new ones grounded in retrieved XML constraints, and

  3. provides rigorous evaluation of retrieval quality, generation fidelity, and robustness to noisy or ambiguous queries.
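
Retrieval quality in the third objective could be measured in several ways. The sketch below shows two common information-retrieval metrics, Recall@k and mean reciprocal rank (MRR). The posting does not prescribe specific metrics, so the choice of metrics, the function names, and the scenario IDs are assumptions for illustration.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical query results: (ranked scenario ids, ground-truth ids).
    runs = [
        (["s3", "s1", "s7"], ["s1"]),
        (["s2", "s9", "s4"], ["s4", "s5"]),
    ]
    for ranked, relevant in runs:
        print("Recall@3:", recall_at_k(ranked, relevant, k=3))
    print("MRR:", round(mean_reciprocal_rank(runs), 3))
```

Robustness to noisy or ambiguous queries can then be probed by perturbing the queries and comparing these scores before and after.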


We Offer

  • A dynamic, future-oriented research environment.

  • Hands-on experience with a state-of-the-art multimodal RAG stack (FAISS/Weaviate/Pinecone, CLIP/OpenCLIP, BLIP, DuckDB/Parquet, LangChain/LlamaIndex).

  • Opportunity to publish a scientific paper (based on merit).

  • Thesis can be written in English or German.


Requirements

  • Initiative and a creative, problem-solving mindset.

  • Excellent English or German proficiency.

  • Strong Python skills; familiarity with PyTorch and basic computer vision/ML.

  • Experience with at least one of: vector databases, information retrieval, VLMs/LLMs, or autonomous-driving data (e.g., CommonRoad).

  • Familiarity with common development tools (Git, Ubuntu).

  • (Nice to have) Experience with CARLA/MetaDrive or scenario standards (CommonRoad/OpenSCENARIO).

 

Start

Work can begin immediately. If you are interested in this topic, please first have a look at our recent survey paper: https://arxiv.org/abs/2506.11526

Then send a brief cover letter explaining why you are fascinated by this subject, along with a current transcript of records and your CV to: yuan_avs.gao@tum.de

Tags
AVS Gao
Possible start
Immediately
Contact
Yuan Gao
yuan_avs.gao@tum.de