
Reconstructing Objects in-the-wild for Realistic Sensor Simulation
ICRA 2023
Ze Yang, Sivabalan Manivasagam, Yun Chen, Jingkang Wang, Rui Hu, Raquel Urtasun
Given a camera video and LiDAR sweeps as input, our model reconstructs accurate geometry and surface properties, which our physics-based radiance module uses to synthesize realistic appearance from novel viewpoints, enabling realistic sensor simulation for self-driving.
NeuSim is composed of a structured neural surface representation and a physics-based reflectance model. This decomposed representation enables generalization to new views from sparse in-the-wild observations. Given a continuous 3D location, NeuSim outputs the signed distance from the point to the object surface, the albedo, and the specular reflectance. The signed distance is used to derive the surface normal, which is then used to shade the diffuse and specular components and obtain the final RGB color. We also render the LiDAR depth and intensity, as well as the object mask, from the learned representation.
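The shading step above can be summarized with a short sketch. This is a minimal illustration under assumed interfaces, not the paper's implementation: `sdf_net`, `albedo_net`, `specular_net`, and the single directional light are hypothetical placeholders, and the simple Lambertian-diffuse-plus-learned-specular split only approximates the physics-based reflectance model described in the paper.

```python
# Minimal sketch of shading a 3D point from an SDF, albedo, and specular term.
# All three networks and the light direction are hypothetical stand-ins.
import torch

def render_color(x, view_dir, light_dir, sdf_net, albedo_net, specular_net):
    """Shade a batch of 3D points into RGB colors."""
    x = x.clone().requires_grad_(True)
    sdf = sdf_net(x)  # signed distance from each point to the object surface

    # Surface normal as the normalized gradient of the signed distance field.
    normal = torch.autograd.grad(sdf.sum(), x, create_graph=True)[0]
    normal = torch.nn.functional.normalize(normal, dim=-1)

    albedo = albedo_net(x)  # per-point base color
    # Lambertian diffuse term: albedo scaled by the cosine between normal and light.
    diffuse = albedo * (normal * light_dir).sum(-1, keepdim=True).clamp(min=0.0)
    # View-dependent specular term predicted from position, normal, and view direction.
    specular = specular_net(x, normal, view_dir)

    return (diffuse + specular).clamp(0.0, 1.0)  # final RGB color
```

Deriving the normal from the SDF gradient is what ties the geometry and shading together: the same signed distance field that defines the surface also determines how light reflects off it.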
We can reconstruct the full 360° shape from partial observations (left video). For each example, the red bounding box on the left annotates the vehicle of interest, and the reconstructed vehicle mesh is shown on the right.
We can reconstruct the full 360° shape by applying structural priors such as symmetry, enabling photorealistic rendering from arbitrary viewpoints.
When tested on novel viewpoints, our approach generalizes better to large viewpoint changes than other methods, demonstrating the value of our physics-based reflectance model. Our method also captures finer details and more accurate colors.
Our method also works on non-vehicle objects, such as a moped with tiny handlebars, or a thin wooden scaffold that blends into the background.
The reconstructed assets can be inserted into existing scenes to generate new scenarios for self-driving simulation. Because our assets are consistent across sensors, we can realistically render both the LiDAR point clouds (top) and the camera images (bottom) for the modified scene. The left video demonstrates manipulation of the inserted actor, while the right video shows the actor aggressively merging into our lane.
@inproceedings{yang2023reconstructing,
  title     = {Reconstructing Objects in-the-wild for Realistic Sensor Simulation},
  author    = {Yang, Ze and Manivasagam, Sivabalan and Chen, Yun and Wang, Jingkang and Hu, Rui and Urtasun, Raquel},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year      = {2023},
}