Fine-tuning Visual Geometry Grounded Transformer for Enhanced Autonomous Driving Perception

Institute
Professorship of Autonomous Vehicle Systems
Type
Semester thesis / Master's thesis
Content

Description

Modern autonomous vehicles often use end-to-end learning to map camera inputs directly to driving decisions. However, these systems frequently lack explicit 3D scene understanding, such as road layouts and obstacles, which limits their interpretability and robustness. The Visual Geometry Grounded Transformer (VGGT) has demonstrated state-of-the-art performance in multi-view 3D reconstruction tasks, including camera parameter estimation, depth map estimation, and dense point cloud generation. Despite these strengths, VGGT has limitations: in particular, it struggles to reconstruct sky regions accurately, so its outputs require intensive post-processing to clean them up.

To address this, we propose fine-tuning VGGT on datasets tailored to autonomous driving, such as nuScenes and synthetic environments generated with CARLA. By incorporating diverse weather conditions, lighting variations, and complex urban scenarios, we aim to improve the accuracy of VGGT's 3D reconstructions of driving scenes.

Project Objective:

This project aims to adapt VGGT for autonomous driving by addressing its limitations in reconstructing sky regions. Core components include:

  • Fine-tuning VGGT on nuScenes (or on data collected from the CARLA simulator) to improve sky region reconstruction
  • Adjusting the model architecture if necessary
  • Comparing the fine-tuned VGGT model with the original one
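
As a rough illustration of the comparison step, the sketch below scores a predicted depth map against ground truth in two ways: the absolute relative error over non-sky pixels, and the fraction of sky pixels where the model predicts geometry closer than a maximum range. Both metrics, the function names, the availability of a per-pixel sky mask, and the 100 m range are assumptions made for this example. They are not part of VGGT or of the project specification, and the actual evaluation protocol would be defined during the thesis.

```python
import numpy as np


def masked_abs_rel(pred_depth, gt_depth, sky_mask):
    """Mean absolute relative depth error over non-sky pixels with valid ground truth.

    Hypothetical helper: assumes gt_depth uses 0 for pixels without ground truth
    and sky_mask is True where a pixel belongs to the sky.
    """
    pred = np.asarray(pred_depth, dtype=np.float64)
    gt = np.asarray(gt_depth, dtype=np.float64)
    sky = np.asarray(sky_mask, dtype=bool)
    valid = (gt > 0) & ~sky
    if not valid.any():
        return float("nan")
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))


def spurious_sky_rate(pred_depth, sky_mask, max_range=100.0):
    """Fraction of sky pixels whose predicted depth is below max_range.

    Hypothetical helper: treats any sky pixel predicted closer than max_range
    (an assumed value, in the same unit as the depth map) as spurious geometry.
    """
    pred = np.asarray(pred_depth, dtype=np.float64)
    sky = np.asarray(sky_mask, dtype=bool)
    if not sky.any():
        return float("nan")
    return float(np.mean(pred[sky] < max_range))
```

Calling both functions on the outputs of the original and the fine-tuned model for the same frames gives a simple before/after comparison: a lower value is better for both numbers.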

 

We Offer:

  • Engagement in cutting-edge research at the intersection of computer vision and autonomous driving

  • Integration with an existing end-to-end driving framework

  • Opportunities to publish findings at top-tier AI and robotics conferences

  • An English-speaking research environment with potential for thesis supervision


Your Qualifications:

  • Proficiency in Python and PyTorch

  • Strong interest in autonomous driving and deep learning

  • Background in computer vision or 3D reconstruction is advantageous


Start date is flexible. If interested, please send:

  • A 200-word description of your performance in computer science coursework (on-campus or online)

  • Academic transcript (optional for now, may be required later)

  • CV (optional)

to dingrui.wang@tum.de.


Possible start
Immediately
Contact
Dingrui Wang
dingrui.wang@tum.de