Foundation Model-Assisted Auto-Labeling Pipeline for Front-View Parking Slot Detection

Institute
Lehrstuhl für Fahrzeugtechnik (TUM-ED)
Type
Semester thesis / Master's thesis
Content
Experimental
Description

Motivation

Reliable parking slot detection is a critical component of modern autonomous driving. However, unlike standard objects (e.g., vehicles, pedestrians) that can be bounded by simple boxes, parking environments are complex and unstructured. Parking slots are often defined by ambiguous markings and diverse obstacles rather than distinct physical volumes, so they require precise 4-corner polygon annotations on the image plane. Acquiring such dense geometric annotations manually is expensive and scales poorly. Furthermore, current approaches rely heavily on fisheye cameras, which limits their use with the standard front-view camera and LiDAR sensor setups common in large-scale driving data. This work addresses these limitations by developing an automated annotation pipeline that leverages foundation models and multi-sensor (camera and LiDAR) geometric reasoning. The core objective is to generate highly accurate parking slot labels without relying on large-scale manual annotation. By building LiDAR-aligned maps and identifying parked vehicles with visual foundation models, the system derives structural parking slot labels directly from front-view setups.

Prerequisites

Work Packages

  • Literature review: foundation models for unstructured object detection and auto-labeling.
  • Implementation:
    • Synthetic dataset creation: using CARLA, implement an end-to-end pipeline that generates virtual front-view images, point clouds, and parking slot labels.
    • Auto-labeling pipeline development: incorporate geometric characteristics, LiDAR-based map generation, and knowledge from foundation models.
    • [Optional] Extension to a self-training loop: develop a method that fully exploits the generated pseudo-labels to achieve the best possible parking slot detection performance.
  • Evaluation: validate the correctness of the generated labels and their efficacy for perception model training, both in simulated environments (CARLA) and on real-world datasets.
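As a rough illustration of what checking label correctness could look like for 4-corner slot polygons, the sketch below computes the intersection-over-union (IoU) between a generated label and a reference label. The choice of IoU as the metric, the assumption that slots are convex quadrilaterals in image coordinates, and all function names are assumptions made for this example; they are not specified by the project description.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(poly: List[Point]) -> float:
    """Signed polygon area via the shoelace formula (positive if counter-clockwise)."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_ccw(poly: List[Point]) -> List[Point]:
    """Return the polygon with counter-clockwise vertex order."""
    return list(poly) if signed_area(poly) >= 0 else list(reversed(poly))


def clip_convex(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` against a convex, CCW `clip` polygon."""
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def side(p: Point) -> float:
            # Positive when p lies to the left of edge a->b (inside for CCW polygons).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        points, output = output, []
        for j in range(len(points)):
            cur, prev = points[j], points[j - 1]
            s_cur, s_prev = side(cur), side(prev)
            if (s_cur >= 0) != (s_prev >= 0):
                # The segment prev->cur crosses the clip edge: add the crossing point.
                t = s_prev / (s_prev - s_cur)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if s_cur >= 0:
                output.append(cur)
    return output


def polygon_iou(pred: List[Point], ref: List[Point]) -> float:
    """IoU of two convex polygons, e.g. a pseudo-label and a reference slot."""
    pred, ref = ensure_ccw(pred), ensure_ccw(ref)
    inter_poly = clip_convex(pred, ref)
    inter = abs(signed_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = abs(signed_area(pred)) + abs(signed_area(ref)) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical pixel coordinates of a generated label and a reference label.
    pseudo_label = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    reference = [(1.0, 0.0), (3.0, 0.0), (3.0, 2.0), (1.0, 2.0)]
    print(f"IoU: {polygon_iou(pseudo_label, reference):.3f}")
```

Non-convex or self-intersecting annotations would need a general polygon clipping routine instead of this convex-only version.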

What you should bring along?

  • Very good programming skills in Python and PyTorch.
  • Knowledge of computer vision and deep learning (required); experience with foundation models such as VLMs or SAM (desired).
  • High personal motivation and independent working style.
  • Very good language proficiency in English.

 

Possibility for publication in case of excellent work.

 

If you are interested, please send me your grade sheet, your CV, and a short introduction (~5 sentences on why this topic interests you)!

Tags
FTM Studienarbeit, FTM AV, FTM AV Perception, FTM Lim, FTM Informatik, FTM IDP
Possible start
Immediately
Contact
Hojun Lim, M.Sc.
hojun.lim@tum.de
Announcement