BaSaMa & HiWi

Foundation Model-Assisted Auto-Labeling Pipeline for Front-View Parking Slot Detection

Institute

Lehrstuhl für Fahrzeugtechnik (TUM-ED)

Type

Semester thesis / Master's thesis

Content

experimental

Description

Motivation

Reliable parking slot detection is a critical component of modern autonomous driving. However, unlike standard objects (e.g., vehicles, pedestrians) that can be bounded by simple boxes, parking environments are complex and unstructured. Parking slots are often defined by ambiguous markings and diverse obstacles rather than distinct physical volumes, so they require precise 4-corner polygon annotations on the image plane. Acquiring such dense geometric annotations manually is expensive and scales poorly. Furthermore, current approaches rely heavily on fisheye cameras, which limits the use of the standard front-view camera and LiDAR setups common in large-scale driving data.

This work addresses these limitations by developing an automated annotation pipeline that leverages foundation models and multi-sensor (camera and LiDAR) geometric reasoning. The core objective is to generate highly accurate parking slot labels without relying on large-scale human annotation. By building LiDAR-aligned maps and identifying parked vehicles with visual foundation models, the system derives structural parking labels directly from front-view setups.

Requirements

Work Packages

Literature review: foundation models for unstructured object detection and auto-labeling.
Implementation:
- Synthetic dataset creation: Using CARLA, implement an end-to-end pipeline that generates virtual front-view images, point clouds, and parking slot labels.
- Auto-labeling pipeline development: Incorporate geometric characteristics, LiDAR-based map generation, and knowledge from foundation models such as VLMs.
- [Optional] Extension to a self-training loop: Develop a method that fully exploits the generated pseudo-labels to achieve the best possible parking slot detection.
Evaluation: Validate the correctness of the generated labels and their efficacy for perception model training, both in simulation (CARLA) and on real-world datasets.
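
To illustrate the label format, the following minimal sketch projects the four 3D corners of one parking slot onto the image plane of a front-view pinhole camera. It is not the pipeline of this thesis. It assumes the corners are already given in the camera frame (x right, y down, z forward), and the function names and intrinsics values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_slot_corners(
    corners_cam: Sequence[Point3D],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    min_depth: float = 0.1,
) -> Optional[List[Pixel]]:
    """Project four 3D slot corners (camera frame) to pixel coordinates.

    Assumed convention: x right, y down, z forward along the optical axis.
    Returns None if any corner is behind or too close to the camera,
    since the polygon would then not be fully observable.
    """
    if len(corners_cam) != 4:
        raise ValueError("a parking slot label needs exactly four corners")
    pixels: List[Pixel] = []
    for x, y, z in corners_cam:
        if z < min_depth:
            return None
        # Standard pinhole projection without lens distortion.
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


def inside_image(pixels: Sequence[Pixel], width: int, height: int) -> bool:
    """Check that every projected corner lies within the image bounds."""
    return all(0 <= u < width and 0 <= v < height for u, v in pixels)


# Hypothetical example: a 2.5 m x 5 m slot on the ground, 1.5 m below the
# camera, starting 5 m ahead. The intrinsics are illustrative values only.
slot = [(-1.25, 1.5, 5.0), (1.25, 1.5, 5.0), (1.25, 1.5, 10.0), (-1.25, 1.5, 10.0)]
polygon = project_slot_corners(slot, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
print(polygon)
```

In a simulator such as CARLA, the corner coordinates would first have to be transformed from world coordinates into the camera frame. That step depends on the sensor setup and is omitted here.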

What you should bring along?

Very good programming skills in Python and PyTorch.
Knowledge of computer vision and deep learning (required); experience with foundation models such as VLMs and SAM (desired).
High personal motivation and independent working style.
Very good language proficiency in English.

Possibility for publication in case of excellent work.

If you are interested, please send me your grade sheet, your CV, a short introduction (about 5 sentences on why this topic interests you), and your earliest possible start date.

Tags

FTM Studienarbeit, FTM AV, FTM AV Perception, FTM Lim, FTM Informatik, FTM IDP

Possible start

immediately

Contact

Hojun Lim, M.Sc.
hojun.lim@tum.de
