Embodied AI and LLM Agents for Medical Imaging and Robotic Systems
- Institute: Chair for Computer Aided Medical Procedures & Augmented Reality (TUM-CIT)
- Type: Bachelor's Thesis / Semester Thesis / Master's Thesis
- Content: experimental / theoretical / constructive
- Description
Exploring LLM agents, multimodal perception, and intelligent robotic systems for next-generation healthcare.
Are you interested in medical imaging AI, robotics, or LLM/VLM-based intelligent systems?
Are you hoping to explore a potential PhD path, strengthen your research profile, or work toward an academic publication?
Join our group for exciting research opportunities in medical imaging, multimodal AI, and embodied intelligent robotic systems. Our projects combine cutting-edge methods with real-world healthcare applications, offering students the chance to work on impactful problems, develop strong research skills, and contribute to systems that can perceive, reason, and act in complex medical scenarios.
Open Thesis Topics:
Project 1: LLM Agent for Robotic Ultrasound Scanning Assistance
This project builds on our existing LLM-agent-based intelligent robotic ultrasound system.
The current system is capable of:
- autonomous scanning based on ultrasound guidance and current measurements,
- selecting probe positions autonomously via tool calling (a minimal sketch of this pattern follows below),
- planning scanning trajectories,
- generating ultrasound reports and optimizing operations.
The next stage of the project is to further improve the system’s robustness and memory capabilities, enabling it to handle more complex and realistic clinical scenarios.
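For orientation, here is a minimal sketch of the tool-calling loop such an agent relies on. All function names, tool signatures, and return values below are hypothetical illustrations, not the group's actual system.

```python
# Hypothetical sketch of an LLM tool-calling loop for probe positioning.
# None of these names come from the actual system; they only illustrate
# the pattern: the LLM proposes a tool call, the host code executes it,
# and the observation is fed back for the next reasoning step.
import json

def move_probe(x: float, y: float, z: float) -> dict:
    """Stand-in for the robot interface: move the probe to a target pose."""
    # A real system would command the robot and return sensor feedback.
    return {"reached": True, "contact_force_n": 2.1}

def capture_ultrasound() -> dict:
    """Stand-in for image acquisition plus a quality estimate."""
    return {"image_id": "frame_0042", "quality": 0.87}

TOOLS = {"move_probe": move_probe, "capture_ultrasound": capture_ultrasound}

def run_agent(llm, goal: str, max_steps: int = 10) -> None:
    """Generic agent loop: ask the LLM for the next tool call, execute it,
    and append the observation to the conversation history."""
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        # `llm` is any callable returning either a final answer or a
        # JSON tool call such as {"tool": "move_probe", "args": {...}}.
        reply = llm(history)
        if reply.get("tool") is None:  # the agent decided it is done
            print(reply["content"])
            return
        result = TOOLS[reply["tool"]](**reply["args"])
        history.append({"role": "tool", "content": json.dumps(result)})
```

In this pattern the LLM never touches the robot directly: it only emits structured tool calls, and the host code executes them and returns the observations. That separation is also where improvements to robustness and memory would plug in.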
Contact: yuan.bi@tum.de
Project 2: Real-Time Ultrasound Understanding with MLLM
This project is based on our existing work on multimodal large language models (MLLMs) for real-time video understanding.
Our lab already has:
- strong experience in general video understanding,
- a solid research roadmap for adapting these methods to the special characteristics of medical ultrasound imaging.
The next steps include:
- expanding public ultrasound video and text datasets,
- transferring general video understanding capabilities to the ultrasound domain,
- building an intelligent system capable of understanding and explaining ultrasound image content (a rough illustration of the input pipeline is sketched below).
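As a rough illustration of what feeding ultrasound video to an MLLM involves, the sketch below samples frames from a clip and pairs them with a text prompt. The file name, sampling rate, and prompt are hypothetical assumptions; only the OpenCV calls are real.

```python
# Hypothetical sketch: turn an ultrasound clip into (frames, prompt) pairs,
# the typical input format for a video MLLM. Path, sampling rate, and
# prompt text are illustrative assumptions only.
import cv2  # OpenCV for video decoding

def sample_frames(video_path: str, every_n: int = 30) -> list:
    """Keep every n-th frame to fit the clip into the model's context."""
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        idx += 1
    cap.release()
    return frames

frames = sample_frames("liver_scan.mp4")  # hypothetical clip
prompt = "Describe the anatomy visible in this ultrasound sweep."
# `frames` plus `prompt` would then be tokenized and fed to the MLLM.
```

Sub-sampling frames like this is a common way to fit a long sweep into a model's context window; how to sample and encode ultrasound video effectively is exactly the kind of question this project explores.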
Contact: yue.zhou@tum.de
What You Will Gain
In these projects, you will have the opportunity to participate in:
- cutting-edge research at the intersection of medical imaging × robotics × multimodal AI,
- LLM Agent system development and decision-making pipelines,
- real-world deployment and validation of robotic systems,
- research on multimodal large language models (MLLMs) for video understanding.
Requirements
For both projects:
- solid background in deep learning and related theory
- strong coding skills and research interest
- interest in medical imaging, AI, and multimodal learning
- background in CS / EE / ME or related fields
Additional requirement for the robotics project (Project 1) only:
- familiarity with robotics (e.g., robot kinematics, control, or ROS), or related hands-on experience
No robotics background is required for the multimodal medical imaging project (Project 2).
Preferred
- experience with robotics experiments, especially if you can come to campus frequently,
- strong research motivation and interest in contributing to publications.
Bonus Skills
Experience in any of the following is a plus:
- KUKA / Franka / ROS robotics experience,
- LLM / RAG / tool calling / RL / medical imaging experience,
- prior MLLM project experience, or experience with data collection and processing.
How to Apply
If you would like to do your MA / SA / Guided Research in areas such as:
- medical imaging + robotics + LLMs, or
- LLM applied to text, image, and video understanding,
you are warmly welcome to contact us by email with your CV or transcript.
- Possible start: immediately
- Contact: Xuesong Li, Xuesong.Li@tum.de