Multi-View Neural Sensor Compression for Autonomous Vehicles

Institute
Lehrstuhl für Fahrzeugtechnik
Type
Semester thesis / Master's thesis
Content

Description

Introduction

Autonomous vehicles use multiple synchronized cameras to achieve comprehensive environmental perception. Current neural video compression approaches process each camera stream independently and therefore fail to exploit the substantial redundancy that arises when multiple cameras observe the same 3D scene structure.

The fixed geometric relationships between cameras in autonomous vehicle configurations provide exploitable priors that remain unused in existing neural compression architectures. Modern range-conditional compression models have demonstrated superior rate-distortion performance for single-camera systems by leveraging depth information, but no work has extended this approach to multi-camera systems, where geometric correspondence across views provides additional compression opportunities.
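To make the notion of geometric correspondence across views concrete, the following is a minimal sketch of how a pixel with known depth in one camera can be mapped into a neighbouring camera under a pinhole model. The function name, the intrinsics, and the 0.5 m baseline are hypothetical values chosen for illustration only; they are not taken from any specific vehicle setup or codec.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def reproject(u, v, depth, k_src, k_dst, rot, trans):
    """Map pixel (u, v) with metric depth from a source to a destination camera.

    k_src, k_dst: pinhole intrinsics as (fx, fy, cx, cy).
    rot, trans:   rigid transform taking points from the source camera frame
                  into the destination camera frame (the fixed extrinsics).
    Returns the destination pixel (u', v'), or None if the point lies
    behind the destination camera.
    """
    fx, fy, cx, cy = k_src
    # Back-project the pixel to a 3D point in the source camera frame.
    p_src = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Apply the fixed camera-to-camera transform.
    p_dst = [a + b for a, b in zip(mat_vec(rot, p_src), trans)]
    if p_dst[2] <= 0:
        return None
    fx2, fy2, cx2, cy2 = k_dst
    # Project into the destination image plane.
    return (fx2 * p_dst[0] / p_dst[2] + cx2, fy2 * p_dst[1] / p_dst[2] + cy2)


# Hypothetical setup: identical intrinsics, destination camera shifted
# 0.5 m along the x-axis of the source camera, no rotation.
K = (1000.0, 1000.0, 640.0, 360.0)
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
BASELINE = [-0.5, 0.0, 0.0]

pixel_in_side_view = reproject(640.0, 360.0, 10.0, K, K, IDENTITY, BASELINE)
```

Because the extrinsics are fixed, such a mapping depends only on per-pixel depth, which is the sense in which range information and camera geometry together could act as a prior for cross-view prediction.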


The goal of this work is to develop a multi-view neural video compression system that jointly encodes multiple camera streams by exploiting shared scene structure through range conditioning and cross-view feature dependencies, and to demonstrate higher compression efficiency than independent per-stream encoding.

Work Packages:
- Literature survey of multi-view video compression and range-conditional neural codecs
- Extension of an existing single-camera range-conditional codec to a multi-camera architecture with a joint latent space
- Development of a computationally efficient temporal module that handles multiple synchronized video streams
- Implementation of per-camera depth conditioning with a structural encoding of camera extrinsics and geometric relationships
- Design of cross-camera conditional entropy coding in which the side cameras are conditioned on the information already transmitted for the center camera
- Comparative evaluation measuring compression gains versus an independent-encoding baseline on autonomous driving datasets
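
The idea behind cross-camera conditional entropy coding can be illustrated with a small rate estimate. The sketch below assumes a discretized Gaussian entropy model, as commonly used in learned compression, and compares the estimated bit cost of a side-camera latent under a fixed prior with its cost when the mean is predicted from the already transmitted center-camera latent. The latent values, the scales, and the identity predictor are hypothetical placeholders; in an actual codec the prediction would come from a learned network.

```python
import math


def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def estimated_bits(symbol, mean, scale):
    """Estimated bits for one integer-quantized latent value.

    The probability mass is the Gaussian density integrated over the
    quantization bin [symbol - 0.5, symbol + 0.5].
    """
    upper = gaussian_cdf(symbol + 0.5, mean, scale)
    lower = gaussian_cdf(symbol - 0.5, mean, scale)
    return -math.log2(max(upper - lower, 1e-9))


# Hypothetical quantized latents of two overlapping views.
center_latent = [4, -3, 7, 0, -6, 2]
side_latent = [5, -3, 6, 1, -6, 2]

# Independent coding: the side camera uses a fixed zero-mean prior.
independent_bits = sum(estimated_bits(y, 0.0, 4.0) for y in side_latent)

# Conditional coding: the mean is predicted from the center-camera latent,
# which the decoder already has. An identity predictor stands in for a
# learned network here, and the smaller scale reflects the lower
# uncertainty that remains after conditioning.
conditional_bits = sum(
    estimated_bits(y, float(c), 1.0)
    for y, c in zip(side_latent, center_latent)
)

print(f"independent: {independent_bits:.1f} bits")
print(f"conditional: {conditional_bits:.1f} bits")
```

With these illustrative numbers the conditional estimate is lower because the side-camera latent is strongly correlated with the center-camera latent; the size of any real gain would have to be measured in the comparative evaluation.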

Recommended Literature

1. Nonlinear Transform Coding
2. End-to-End Neural Video Compression: A Review
3. DCVC-RT
4. Low-Latency Neural Stereo Streaming
5. Neural Stereo Video Compression with Hybrid Disparity Compensation
6. LMVC: An End-To-End Learned Multiview Video Coding Framework

If you are interested or have any questions, please send me an e-mail (niklas.krauss@tum.de) with your CV and a current transcript of records. Thank you!

Prerequisites

Requirements:
- Programming experience with Python and proficiency in PyTorch
- Understanding of multi-view geometry and camera calibration
- High personal motivation and an independent working style
- Very good command of German and English

Technologies used
Neural Data Compression, Python, PyTorch, Machine Learning, Super Resolution, Compression, Autonomous Driving, Autonomous Vehicles, Teleoperation
Tags
FTM Studienarbeit, FTM Krauss, FTM AV, FTM AV Safe Operation, FTM Informatik, FTM Teleoperation
Possible start
immediately
Contact
Niklas Krauß
Room: 3507
Tel.: +49 172 1736882
niklas.krauss@tum.de