I do robotics. Let robots learn.

Hi there. I'm Zhi Wang, and I go by Leo. I'm a first-year CS PhD student in Robotics🤖 at the University of Maryland, College Park, advised by Prof. Yiannis Aloimonos. I received my Bachelor's degree in Electronic Engineering from Tsinghua University.

I'm also a research intern at Amazon FAR (Frontier AI & Robotics), working with Prof. Guanya Shi, Prof. Carmelo Sferrazza, Dr. Rocky Duan, and Prof. Pieter Abbeel.

Previously, I was fortunate to work with Prof. Wenzhen Yuan at UIUC CS, Dr. Shaohan Huang at Microsoft Research and Prof. Jianyu Chen at Tsinghua IIIS.

I work on efficient robot learning from human data.

Goal: Let robots learn from multimodal information, like human videos and haptics, to achieve fine-grained and general-purpose robotic manipulation.

CV / GitHub / LinkedIn / X /
Instagram / WeChat / RedNote

Zhi (Leo) Wang
Robotics PhD @ UMD
Research Intern @ Amazon FAR
Email: tx.leo.wz@gmail.com
Updates
Publications
HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos
Zhi Wang, Botao He, Kelin Yu, Seungjae Lee, Ruohan Gao, Furong Huang, Yiannis Aloimonos
Website / Paper / Video / Code
We introduce HumanEgo, a robot-data-free, hardware-agnostic, and data-efficient pipeline that learns robot manipulation policies from minutes of raw human egocentric videos, powered by a flow matching policy with dense auxiliary objectives.


DoorBot: Closed-Loop Task Planning and Manipulation for Door Opening in the Wild with Haptic Feedback
Zhi Wang*, Yuchen Mo*, Shengmiao Jin, Wenzhen Yuan
IEEE International Conference on Robotics and Automation (ICRA), 2025
Website / Paper / Video / Code / UIUC Summary / Dataset
We propose DoorBot, a haptic-aware, closed-loop hierarchical control framework that enables robots to explore and open unseen doors in the wild. We tested the system on 20 unseen doors across different buildings, covering diverse appearances and mechanical types. The framework achieved a 90% success rate, demonstrating that it generalizes to varied door-opening tasks and handles them robustly.


KOSMOS-E: Learning to Follow Instruction for Robotic Grasping
Zhi Wang*, Xun Wu*, Shaohan Huang, Li Dong, Wenhui Wang, Shuming Ma, Furu Wei
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024, Oral Pitch
Website / Paper / Video / Code
We propose KOSMOS-E, a Multimodal Large Language Model (MLLM) that leverages instruction-following robotic grasping data to improve precise and intricate robotic grasping.
Education
Experiences
Leadership & Activities

Chair of the Electronic Engineering Hardware Group, Tsinghua University, 2021 - 2023

  • 30-Person Team, Tsinghua University [Website]
  • Organized two major annual, university-wide competitions, engaging over 450 participants.

Leader of the Hardware and Vision Team in Future Robot Club (FuRoC), Tsinghua University, 2021 - 2023

  • 15-Person Team, Tsinghua University [Website] [GitHub]
  • Led the team behind Tinker, a domestic service robot, in the annual RoboCup@Home competition.
Teaching Experience
Honors & Awards
MISC