Benchmarking Vision-Language Models for Autonomous Driving: Model Evaluation Pipeline

Institute
Professur für autonome Fahrzeugsysteme (Professorship of Autonomous Vehicle Systems, AVS)
Type
Bachelor's Thesis / Semester Thesis /
Content
experimental /  
Description

You will conduct your research at AVS!

Vision-Language Models (VLMs) such as GPT-4V, Qwen3-VL, and Gemini are increasingly being explored for autonomous driving (AD) applications. However, there is currently no systematic benchmark that evaluates whether these models actually understand vehicle dynamics and can assess trajectory quality, a safety-critical capability for planning in AD.

Within this research project, you will develop the evaluation pipeline to systematically benchmark state-of-the-art VLMs on physical reasoning tasks for autonomous driving. You will work with both commercial APIs and open-source models across multiple hardware platforms.

Your tasks will include:

  • Implementing API wrappers for commercial VLMs (GPT-4V, Gemini, Claude)
  • Setting up inference pipelines for open-source models (Qwen3-VL, InternVL, Llama-Vision) on GPU workstations
  • Developing a standardized evaluation framework for consistent model comparison
  • Implementing metrics for classification accuracy, temporal consistency, and explanation quality
  • Automating large-scale experiment execution and results aggregation
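
To give a rough idea of what the tasks above could look like in code, here is a minimal sketch of a shared evaluation interface with two of the listed metrics. It is an illustration only, not the project's actual design: the `Sample` fields, the treatment of a model as a plain callable, and the definition of temporal consistency (share of consecutive frames in a scene that receive the same prediction) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Sample:
    """One benchmark item (hypothetical schema, assumed for this sketch)."""
    scene_id: str    # frames sharing a scene_id form one temporal sequence
    frame_idx: int   # position of the frame within its scene
    image_path: str
    prompt: str
    label: str       # ground-truth class


# Assumption: every wrapped model, whether a commercial API or a local
# open-source model, is exposed as a callable (image_path, prompt) -> label.
Model = Callable[[str, str], str]


def run_model(model: Model, samples: Sequence[Sample]) -> List[str]:
    """Query the model once per sample and collect its predicted labels."""
    return [model(s.image_path, s.prompt) for s in samples]


def accuracy(preds: Sequence[str], samples: Sequence[Sample]) -> float:
    """Fraction of samples whose prediction equals the ground-truth label."""
    if not samples:
        return 0.0
    correct = sum(p == s.label for p, s in zip(preds, samples))
    return correct / len(samples)


def temporal_consistency(preds: Sequence[str], samples: Sequence[Sample]) -> float:
    """Share of consecutive frame pairs within a scene with equal predictions.

    This is one possible definition, chosen for illustration.
    """
    by_scene: Dict[str, list] = {}
    for p, s in zip(preds, samples):
        by_scene.setdefault(s.scene_id, []).append((s.frame_idx, p))
    same = total = 0
    for frames in by_scene.values():
        frames.sort()  # order frames by frame_idx
        for (_, a), (_, b) in zip(frames, frames[1:]):
            total += 1
            same += int(a == b)
    return same / total if total else 1.0


def benchmark(models: Dict[str, Model],
              samples: Sequence[Sample]) -> Dict[str, Dict[str, float]]:
    """Run every model on the same samples and aggregate the metrics."""
    results = {}
    for name, model in models.items():
        preds = run_model(model, samples)
        results[name] = {
            "accuracy": accuracy(preds, samples),
            "temporal_consistency": temporal_consistency(preds, samples),
        }
    return results
```

Because each model sits behind the same callable signature, API-based and locally hosted models can be passed to `benchmark` side by side and compared on identical samples.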

Your work will contribute directly to a conference publication.

Requirements
  • Python programming skills
  • Experience with API integration and REST services
  • Familiarity with deep learning frameworks (PyTorch, HuggingFace)
  • Interest in autonomous driving and machine learning
  • Organized and independent working style

A plus:

  • Prior project work in robotics or autonomous systems
  • Experience in Formula Student or similar project-based teams

Your Benefits:

  • Future-oriented field of research
  • Young and dynamic team
  • Academic and professional support
  • Organized and structured project
  • Direct contribution to a scientific publication targeting a top robotics conference
  • Project work in English or German
Tags
AVS Schaefer
Possible start
Immediately
Contact
Finn Rasmus Schäfer
finn.schaefer@tum.de