Close the Optical Sensing Domain Gap by Physics-Grounded Active Stereo Sensor Simulation

IEEE Transactions on Robotics (T-RO)

¹University of California, San Diego, ²Tsinghua University

Off-the-Shelf Real-Time Sensor Simulation

The pipeline in this paper has been integrated into the open-source library SAPIEN, a high-performance interactive simulation environment. SAPIEN enables various robotic vision and interaction tasks that require detailed part-level understanding. With SAPIEN, you can build the scene, set up the renderer, and run the sensor simulation in real time (60+ FPS), all in a few lines of Python code:

import torch
import sapien.core as sapien
from sapien.sensor import StereoDepthSensor, StereoDepthSensorConfig

# Set up simulation engine and renderer
sim = sapien.Engine()
renderer = sapien.SapienRenderer()

# Set up scene
scene = sim.create_scene()

# Set up sensor
sensor_config = StereoDepthSensorConfig()
sensor = StereoDepthSensor('sensor', scene, sensor_config)

# Realistic depth simulation

# Get depth map as PyTorch GPU Tensor
depth_dl = sensor.get_depth_dl_tensor()
depth_tensor = torch.utils.dlpack.from_dlpack(depth_dl).clone()

# Get RGB point cloud as PyTorch GPU Tensor
rgb_pc_dl = sensor.get_pointcloud_dl_tensor(with_rgb=True)
rgb_pc_tensor = torch.utils.dlpack.from_dlpack(rgb_pc_dl).clone()
The modules are highly configurable, so the pipeline can easily adapt to different real-world sensors. For more information, please refer to the tutorial linked at the top of the webpage.
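To make the relationship between a depth map and a point cloud concrete, the following is a minimal sketch of pinhole back-projection in plain Python. It is not SAPIEN's implementation: the function name `backproject`, the intrinsics `fx`, `fy`, `cx`, `cy`, the generic camera convention (x right, y down, z forward), and the treatment of non-positive depth as "invalid" are all assumptions made for illustration.

```python
def backproject(depth, fx, fy, cx, cy):
    """Back-project a 2D depth map (meters) into a list of 3D points.

    Uses the standard pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with depth <= 0 are skipped (assumed convention for missing depth).
    """
    points = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0.0:
                continue                   # no valid measurement at this pixel
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Hypothetical 2x2 depth map and intrinsics, chosen only for illustration.
example_depth = [[0.0, 2.0],
                 [1.0, 0.0]]
example_points = backproject(example_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In practice the sensor's own point-cloud getter shown above should be preferred, since it already accounts for the configured intrinsics and coordinate frame.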


In this paper, we focus on the simulation of active stereovision depth sensors, which are popular in both academia and industry. Inspired by the underlying mechanism of these sensors, we designed a fully physics-grounded simulation pipeline that includes material acquisition, ray-tracing-based infrared (IR) image rendering, IR noise simulation, and depth estimation. The pipeline generates, in real time, depth maps with material-dependent error patterns similar to those of a real depth sensor. We conduct real-world experiments to show that perception algorithms and reinforcement learning policies trained in our simulation platform transfer well to real-world test cases without any fine-tuning. Furthermore, because the simulation is highly realistic, our depth sensor simulator can serve as a convenient testbed for evaluating how algorithms will perform in the real world, which greatly reduces the human effort required to develop robotic algorithms. The entire pipeline has been integrated into the SAPIEN simulator and is open-sourced to promote research in the vision and robotics communities.
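The depth-estimation stage of any rectified stereo setup ultimately rests on triangulation: depth equals focal length times baseline divided by disparity. The sketch below illustrates only that textbook relation and is not the matching algorithm used in the paper. The function name `disparity_to_depth`, the parameter names, the sample values, and the use of 0.0 to mark unmatched pixels are assumptions.

```python
def disparity_to_depth(disparity, focal_px, baseline_m, min_disparity=1e-6):
    """Convert a 2D disparity map (pixels) to depth (meters).

    Applies depth = focal_px * baseline_m / disparity per pixel.
    Disparities at or below min_disparity are treated as unmatched and
    mapped to 0.0 (an assumed convention for invalid depth).
    """
    depth = []
    for row in disparity:
        depth.append([
            focal_px * baseline_m / d if d > min_disparity else 0.0
            for d in row
        ])
    return depth


# Hypothetical values, not taken from any particular sensor.
disparity_map = [[20.0, 10.0],
                 [0.0, 40.0]]
depth_map = disparity_to_depth(disparity_map, focal_px=600.0, baseline_m=0.05)
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels produces a depth error that grows roughly with the square of the distance, which is one reason errors in the IR images translate into structured errors in the depth map.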

Pipeline Overview


RGB, infrared, and depth from a RealSense D415 (left) vs. RGB, infrared, and depth generated by our method (right). Note that our method can simulate the error pattern of the real depth sensor. We found that the presence of such patterns is vital to sim-to-real performance.

Qualitative comparison of 6D object pose estimation algorithms on real depth images. The scene is challenging for pose estimation because the depth measurements of the real objects (Golden ball, S.Pellegrino) are noisy and incomplete. All three pose estimation algorithms infer accurate poses despite being trained solely on simulated data generated by our method. Note that only depth maps are used for pose estimation; the RGB images are shown for visualization only.


  author={Zhang, Xiaoshuai and Chen, Rui and Li, Ang and Xiang, Fanbo and Qin, Yuzhe and Gu, Jiayuan and Ling, Zhan and Liu, Minghua and Zeng, Peiyu and Han, Songfang and Huang, Zhiao and Mu, Tongzhou and Xu, Jing and Su, Hao},
  journal={IEEE Transactions on Robotics}, 
  title={Close the Optical Sensing Domain Gap by Physics-Grounded Active Stereo Sensor Simulation},