
SaLF: Sparse Local Fields for Multi-Sensor Rendering in Real-Time
Yun Chen*, Matthew Haines*†, Jingkang Wang, Krzysztof Baron-Lis, Sivabalan Manivasagam, Ze Yang, Raquel Urtasun
Autonomous driving development requires high-fidelity sensor simulation, but existing approaches trade off computational efficiency against sensor modeling capabilities. Current methods excel at either camera simulation or LiDAR rendering, but not both, and advanced sensor effects typically come at the cost of real-time performance. SaLF addresses this gap by introducing a unified volumetric representation that enables real-time rendering of multiple sensor types while supporting sophisticated sensor modeling capabilities, ultimately accelerating autonomous driving development through more realistic and efficient simulation.
SaLF represents scenes as a sparse grid of voxel primitives where each voxel contains a local implicit field mapping 3D coordinates to density and color. It uses adaptive pruning and densification to efficiently handle large scenes while preserving fine details. Each voxel has geometric parameters (position, scale, rotation) and learnable parameters (geometry field, color field, spherical harmonics).
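The representation above can be sketched in code. The class below is an illustrative assumption of what a single voxel primitive might look like (the names, the tiny-MLP field, and all shapes are hypothetical, not the paper's actual implementation): geometric parameters place the voxel in the world, and a small learnable field maps local 3D coordinates to density and color.

```python
# Hypothetical sketch of a SaLF-style voxel primitive; names, shapes, and the
# tiny-MLP field are illustrative assumptions, not the paper's implementation.
import numpy as np

class SparseVoxel:
    def __init__(self, position, scale, rotation, hidden=16):
        self.position = np.asarray(position, dtype=np.float32)  # world-space center
        self.scale = np.asarray(scale, dtype=np.float32)        # per-axis extent
        self.rotation = np.asarray(rotation, dtype=np.float32)  # 3x3 rotation matrix
        rng = np.random.default_rng(0)
        # Tiny per-voxel MLP: local 3D coordinate -> (density, RGB)
        self.w1 = rng.normal(0, 0.1, (3, hidden)).astype(np.float32)
        self.w2 = rng.normal(0, 0.1, (hidden, 4)).astype(np.float32)

    def to_local(self, x_world):
        # Map a world-space point into the voxel's normalized local frame.
        return (self.rotation.T @ (x_world - self.position)) / self.scale

    def query(self, x_world):
        # Evaluate the local implicit field at a 3D point.
        h = np.maximum(self.to_local(x_world) @ self.w1, 0.0)   # ReLU
        out = h @ self.w2
        density = np.log1p(np.exp(out[0]))        # softplus -> non-negative density
        color = 1.0 / (1.0 + np.exp(-out[1:]))    # sigmoid -> RGB in [0, 1]
        return density, color
```

A full scene would hold many such voxels in a sparse grid, pruning empty ones and densifying where fine detail is needed; view-dependent effects would additionally use the spherical-harmonics parameters mentioned above.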
SaLF supports dual rendering:
Ray-casting with octree acceleration for complex sensors and effects like refraction and shadows
Tile-based splatting for efficient pinhole camera rendering
This unified approach allows choosing the optimal rendering method based on sensor type while maintaining consistent visual quality.
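The dual-path idea above amounts to a small dispatch rule. The function below is a minimal sketch under assumed sensor names and a guessed selection criterion (the paper does not specify this exact logic): pinhole cameras without secondary-ray effects take the fast splatting path, everything else falls back to ray-casting.

```python
# Illustrative dispatch between SaLF's two rendering paths; the sensor-type
# strings and the selection rule are assumptions made for this sketch.
def choose_renderer(sensor_type, needs_secondary_rays=False):
    """Pick a rendering path: tile-based splatting for plain pinhole
    cameras, octree-accelerated ray-casting for everything else
    (LiDAR, panoramas, refraction, shadows, ...)."""
    if sensor_type == "pinhole_camera" and not needs_secondary_rays:
        return "tile_splatting"     # fast rasterization path
    return "octree_raycasting"      # general path for complex sensors/effects
```

Because both paths sample the same sparse voxel representation, switching between them does not change the scene content, only the traversal strategy.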
SaLF achieves high photorealism on complex urban driving scenes, reconstructing them rapidly (under 30 minutes). The resulting representation can be rendered in real-time (>30 FPS) from novel viewpoints, handling diverse backgrounds, traffic participants, and lighting conditions.
In addition to cameras, SaLF also efficiently simulates LiDAR sensors, generating realistic point clouds at high speeds (>400 FPS).
SaLF’s flexible representation enables simulation of diverse camera models beyond standard pinhole cameras. Here we demonstrate rendering 360° panoramic views, crucial for simulating surround-view systems and providing complete environmental awareness.
Accurate sensor simulation requires modeling physical effects like rolling shutter. SaLF captures the temporal distortion artifacts common in self-driving vehicle (SDV) sensors, which are especially visible in dynamic scenes with relative motion between the sensor and objects, and modeling them is essential for realistic simulation of high-speed scenarios.
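To make the rolling-shutter effect concrete, the sketch below assigns each image row its own capture time, sweeping from top to bottom over the sensor readout. This is a common way such an effect is modeled (rays for row r are cast with poses interpolated at that row's timestamp); the function name and the linear sweep are assumptions, not SaLF's exact formulation.

```python
# Minimal rolling-shutter timing sketch: each row is captured at a slightly
# different time. The linear top-to-bottom sweep is an assumption for
# illustration, not necessarily SaLF's exact model.
import numpy as np

def rolling_shutter_times(t_start, readout_time, image_height):
    """Per-row capture times, sweeping linearly over the readout window."""
    rows = np.arange(image_height)
    return t_start + readout_time * rows / max(image_height - 1, 1)

# Rays for row r would then use sensor/object poses evaluated at times[r],
# producing the characteristic skew on fast-moving objects.
```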
Leveraging its ray-casting rendering path, SaLF can simulate complex light transport phenomena. This includes effects like refraction, reflections, and shadows, enhancing realism.
SaLF offers a fast, realistic, and versatile way to simulate self-driving sensors like cameras and LiDAR. By uniquely supporting both rasterization and ray-tracing in one sparse voxel format, it achieves real-time speeds, handles complex sensors and effects, and trains much faster, enabling more scalable and comprehensive sensor simulation in self-driving.
@article{chen2025salf,
  title={SaLF: Sparse Local Fields for Multi-Sensor Rendering in Real-Time},
  author={Chen, Yun and Haines, Matthew and Wang, Jingkang and Baron-Lis, Krzysztof and Manivasagam, Sivabalan and Yang, Ze and Urtasun, Raquel},
  journal={arXiv preprint},
  year={2025},
}