Multimodal Large Language Models for Autonomous Vehicle Awareness

Institute
Lehrstuhl für Fahrzeugtechnik
Type
Semester thesis / Master's thesis
Content
experimental / theoretical
Description

Traditional algorithms in autonomous vehicles are designed to handle predefined scenarios. However, real-world driving often presents novel and unexpected situations.

Deploying a Multimodal Large Language Model (MLLM) to improve an autonomous vehicle's awareness of its environment could substantially advance autonomous driving. By combining natural language processing with multimodal perception, an MLLM can complement existing algorithms and make autonomous vehicles more context-aware and adaptive.

MLLMs can generate dynamic, context-aware responses based on the sensor inputs received by the vehicle. This adaptability is crucial for navigating varied scenarios such as changing weather conditions, construction zones, or unexpected obstacles. Real-time analysis of multimodal data enables the vehicle to make informed decisions and respond proactively to changes in its environment.

The goal of this thesis is to improve the awareness of state-of-the-art autonomous driving software by adapting it with MLLMs.

Tasks to be completed during the thesis:

  • Conduct a state-of-the-art review on the use of MLLMs
  • Conduct a state-of-the-art review on available open-source pre-trained MLLMs
  • Design a multimodal fusion mechanism to integrate information from different sensors and data modalities
  • Fine-tune the MLLM on specific autonomous driving tasks to improve performance in the chosen context, based on available data
  • Integrate the MLLM into a given simulation framework
  • Test and evaluate the performance of the proposed approach in simulation and/or experiment
  • Compare the performance of the proposed approach with traditional methods
  • Write a comprehensive thesis report documenting the research, methodology, implementation, and results
Requirements
  • Motivation to familiarize yourself with new topics and to try new ideas
  • Ideally, prior theoretical knowledge of Model Predictive Control
  • Ideally, prior experience with Python and Git
Technologies used
Python, C++, Programming, Autonomous Driving, Machine Learning, Reinforcement Learning, Deep Learning, Model Predictive Control, MPC, Large Language Model, LLM, MLLM, multimodal language model, Parameter Estimation, Heuristics, motion control
Tags
FTM Studienarbeit, FTM AV, FTM Zarrouki, FTM Informatik, FTM AV Perception
Possible start
immediately
Contact
Baha Zarrouki, M.Sc.
Room: MW3527
Tel.: +49 (89) 289 - 10498
baha.zarrouki@tum.de