We propose DoorBot, a haptic-aware, closed-loop hierarchical control framework that enables robots to explore and open a variety of unseen doors in the wild. We test our system on 20 unseen doors across different buildings, featuring diverse appearances and mechanical types. Our framework achieves a 90% success rate, demonstrating its ability to generalize to and robustly handle varied door-opening tasks.
Visual appearance and configurations of our bimanual mobile robot.
System Architecture of DoorBot.
We propose a hierarchical closed-loop controller that enables a mobile robot to autonomously open various doors and walk through them in open environments. Our method generalizes robustly to different handle types in the wild.
We design six motion primitives based on the key steps of opening doors and implement them through low-level controllers. This reduces the dimensionality of the action space and avoids reliance on extensive human expert data.
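A minimal sketch of how such a primitive library might be organized. The paper names grasp, unlock-lever, unlock-knob, and open; the remaining primitive names, and the `robot`/`perception` interfaces, are our assumptions for illustration:

```python
from enum import Enum, auto

class Primitive(Enum):
    """Motion primitives for door opening. GRASP, UNLOCK_LEVER,
    UNLOCK_KNOB, and OPEN appear in the paper; the last two names
    are our guesses at the remaining primitives."""
    GRASP = auto()
    UNLOCK_LEVER = auto()
    UNLOCK_KNOB = auto()
    OPEN = auto()
    SWING = auto()          # hypothetical: hold the door open while moving
    WALK_THROUGH = auto()   # hypothetical: drive the base through the doorway

def open_door(robot, perception):
    """Hypothetical high-level loop: the policy sequences primitives
    rather than emitting joint-level commands, which keeps the action
    space small and avoids the need for large expert datasets."""
    handle = perception.detect_handle()            # RGB image + handle mask
    robot.execute(Primitive.GRASP, target=handle)
    if handle.kind == "lever":
        robot.execute(Primitive.UNLOCK_LEVER)
    elif handle.kind == "knob":
        robot.execute(Primitive.UNLOCK_KNOB)
    robot.execute(Primitive.OPEN)
    robot.execute(Primitive.WALK_THROUGH)
```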
GUM refines the model-based grasp pose prior for the grasp primitive and simultaneously predicts the motion trajectory for unlocking. It takes RGB and mask images of the handle as input and outputs an adjusted grasp offset (dx, dy) and the unlock axis direction (R). The model is trained on a combination of internet and real-world data, which allows it to generalize effectively to unseen scenarios.
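A minimal PyTorch-style sketch of the GUM interface described above. Only the inputs (RGB plus handle mask) and outputs (grasp offset (dx, dy), unlock axis direction R) come from the text; the backbone, layer sizes, and the choice of a 3-D unit vector for R are our assumptions:

```python
import torch
import torch.nn as nn

class GUM(nn.Module):
    """Sketch of the Grasp-and-Unlock Model: consumes an RGB image and a
    binary handle mask, regresses a 2-D grasp-pose correction and the
    unlock axis direction. The small CNN below is an assumption, not
    the paper's architecture."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(              # 4-channel input: RGB + mask
            nn.Conv2d(4, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.offset_head = nn.Linear(64, 2)        # (dx, dy) grasp correction
        self.axis_head = nn.Linear(64, 3)          # unlock axis direction R

    def forward(self, rgb, mask):
        x = torch.cat([rgb, mask], dim=1)          # (B, 4, H, W)
        feat = self.encoder(x)
        axis = self.axis_head(feat)
        axis = axis / axis.norm(dim=-1, keepdim=True)  # normalize to a unit direction
        return self.offset_head(feat), axis
```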
Examples of how GUM corrects poor grasp poses during our field tests.
Haptic feedback in three motion primitives. For unlock-lever and unlock-knob, a current threshold on the elbow joint tells the robot when to stop. For open, an increase or decrease in the elbow-joint current indicates a push- or pull-type door, respectively.
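A sketch of how the elbow-current rules in this caption could be implemented. The threshold value and the robot API (`elbow_current`, `step_handle_rotation`, `hold`) are hypothetical; only the two decision rules come from the caption:

```python
ELBOW_STOP_CURRENT = 1.5   # amps; hypothetical threshold

def unlock(robot):
    """unlock-lever / unlock-knob: rotate the handle until the elbow
    current crosses the threshold, signaling the mechanical limit."""
    while robot.elbow_current() < ELBOW_STOP_CURRENT:
        robot.step_handle_rotation()
    robot.hold()

def door_type(robot, baseline_current):
    """open: per the caption, a rising elbow current indicates a
    push-type door, a falling current a pull-type door."""
    delta = robot.elbow_current() - baseline_current
    return "push" if delta > 0 else "pull"
```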
With multi-modal feedback, our system can open a cabinet whose unlocking direction is unknown via an explore-and-adapt strategy.
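One way to realize this explore-and-adapt behavior. The retry logic and the current-based failure test are our reading of the caption, not the authors' exact procedure, and the threshold is hypothetical:

```python
def unlock_unknown_direction(robot, jam_current=1.5):
    """Explore-and-adapt sketch: attempt one rotation direction; a
    current spike means the mechanism is jammed rather than turning,
    so back off and try the other way."""
    for direction in ("clockwise", "counterclockwise"):
        robot.rotate_handle(direction)
        if robot.elbow_current() < jam_current:
            return direction                       # latch is turning freely
        robot.rotate_handle(_opposite(direction))  # undo the failed attempt, adapt
    raise RuntimeError("unlocking failed in both directions")

def _opposite(direction: str) -> str:
    return "counterclockwise" if direction == "clockwise" else "clockwise"
```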
Field test setting. We experiment in 20 environments on the university campus. These in-the-wild scenes contain diverse door appearances, handle types, physical properties, and visual distractions such as varying illumination. None of these scenes appear in our training dataset.
High-Level | Low-Level | Lever | Knob | Crossbar | Cabinet | Avg |
---|---|---|---|---|---|---|
VLM | VLM | 7/25 | 13/25 | 7/25 | 23/25 | 50% |
Ours | VLM | 7/25 | 19/25 | 8/25 | 23/25 | 57% |
VLM | Ours | 22/25 | 23/25 | 16/25 | 25/25 | 86% |
Ours | Ours | 23/25 | 23/25 | 19/25 | 25/25 | 90% |
Our method consistently outperforms the other combinations, improving the average success rate from 50% to 90% across all manipulation tasks. None of the doors or handles appear in the training set, demonstrating our model's generalizability across different situations.
Method | door1 | door2 | door3 | door4 | door5 | Avg |
---|---|---|---|---|---|---|
w/o GUM | 0/5 | 2/5 | 3/5 | 0/5 | 0/5 | 20% |
GUM* | 4/5 | 5/5 | 5/5 | 5/5 | 5/5 | 92% |
GUM | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 100% |
Method | door1 | door2 | door3 | door4 | door5 | Avg |
---|---|---|---|---|---|---|
Open-Loop | 3/5 | 1/5 | 2/5 | 1/5 | 3/5 | 40% |
Closed-Loop | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 100% |
Method | Grasp | Unlock-L | Unlock-K | Open | Push / Pull |
---|---|---|---|---|---|
CLIP | 42.67% | 33.33% | 38.81% | 15.79% | 15.00% |
Gemini | 29.63% | 93.75% | 86.36% | 88.88% | 65.00% |
Haptics | 100% | 100% | 100% | 100% | 100% |