UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation
Overview
UltraLiDAR learns discrete representations from large-scale LiDAR point clouds and performs realistic, scalable, and controllable LiDAR completion and generation. Top row: sparse-to-dense LiDAR completion. Second row: controllable manipulation of real LiDAR scans via actor removal and insertion. Third row: diverse LiDAR generation with realistic global structure and fine-grained details. Bottom row: conditional scene generation from partially observed point clouds.
Video
Play with sound.
Sparse-to-Dense LiDAR Completion
Our densified point clouds preserve the original structure of the sparse input and recover the geometry of the vehicles. The results are consistent across time even though we reconstruct each frame independently.
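To make the per-frame completion pipeline concrete, here is a minimal self-contained PyTorch sketch of the encode, quantize, decode flow described in the Method section below. All class names, layer sizes, and the BEV grid shape are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the paper's components; sizes are illustrative.
class SparseEncoder(nn.Module):
    def __init__(self, in_ch=1, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, kernel_size=4, stride=4),  # BEV grid -> feature map
            nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
        )

    def forward(self, bev):
        return self.net(bev)

class Codebook(nn.Module):
    """VQ-style codebook: snap each feature vector to its nearest code."""
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_codes, dim)

    def forward(self, feats):                                 # (B, D, h, w)
        B, D, h, w = feats.shape
        flat = feats.permute(0, 2, 3, 1).reshape(-1, D)
        idx = torch.cdist(flat, self.embed.weight).argmin(1)  # nearest code index
        return self.embed(idx).reshape(B, h, w, D).permute(0, 3, 1, 2)

class DenseDecoder(nn.Module):
    def __init__(self, out_ch=1, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(dim, dim, kernel_size=4, stride=4),  # back to BEV size
            nn.ReLU(),
            nn.Conv2d(dim, out_ch, kernel_size=3, padding=1),
        )

    def forward(self, quant):
        return self.net(quant)

# Each sweep is completed independently: sparse BEV grid in, dense grid out.
enc, codebook, dec = SparseEncoder(), Codebook(), DenseDecoder()
sparse_bev = torch.zeros(1, 1, 256, 256)    # toy voxelized sparse sweep
dense_bev = dec(codebook(enc(sparse_bev)))  # (1, 1, 256, 256)
```

Because each frame is completed independently, densifying a sequence is just this forward pass applied frame by frame.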
Unconditional LiDAR Generation
Method
For (a) LiDAR completion, the sparse encoder maps the sparse point cloud to discrete codes, and the dense decoder reconstructs dense data from them. For (b) LiDAR generation, the transformer starts from a blank canvas, or from a canvas populated with codes mapped from the partial observations, and iteratively predicts and updates the missing codes. The decoder then produces the final LiDAR output from the predicted codes.
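The iterative prediction loop in (b) can be sketched as follows. This is a minimal masked-token-style illustration: the function names, the linear commit schedule, and the ToyTransformer stub are all hypothetical placeholders, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def generate_codes(transformer, code_grid, known, steps=8):
    """Iteratively fill in unknown discrete codes on a flattened BEV canvas.

    code_grid: (L,) long tensor of code indices; unknown positions hold a
               placeholder [MASK] index. Updated in place and returned.
    known:     (L,) bool tensor; True where codes come from partial
               observations (all False for a blank canvas).
    """
    unknown = ~known
    for step in range(steps):
        if not unknown.any():
            break
        logits = transformer(code_grid.unsqueeze(0))[0]   # (L, num_codes)
        conf, pred = logits.softmax(-1).max(-1)           # per-position confidence
        conf = conf.masked_fill(~unknown, float('-inf'))  # only unknowns compete

        remaining = int(unknown.sum())
        n_commit = -(-remaining // (steps - step))        # ceil: done by last step
        top = conf.topk(n_commit).indices                 # most confident positions
        code_grid[top] = pred[top]                        # accept their codes
        unknown[top] = False
    return code_grid

# Toy stand-in so the sketch runs end to end; the real model is a transformer
# over the code grid.
class ToyTransformer(nn.Module):
    def __init__(self, num_codes=512, dim=32):
        super().__init__()
        self.emb = nn.Embedding(num_codes + 1, dim)  # +1 for the [MASK] token
        self.head = nn.Linear(dim, num_codes)

    def forward(self, idx):
        return self.head(self.emb(idx))

L, K = 64, 512
MASK = K                                           # index of the [MASK] token
canvas = torch.full((L,), MASK, dtype=torch.long)  # blank canvas
codes = generate_codes(ToyTransformer(K), canvas,
                       known=torch.zeros(L, dtype=torch.bool))
# The dense decoder then maps the completed code grid to the LiDAR output.
```

Committing only the most confident predictions each step lets later iterations refine the remaining positions conditioned on the codes already placed.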
Comparison with Other Methods
We plan to release the results used in our A/B tests to facilitate future research.
Generation Results
We show conditional and unconditional generation results from the model trained on the KITTI-360 dataset.
BibTeX
@inproceedings{xiong2023learning,
  title     = {Learning Compact Representations for LiDAR Completion and Generation},
  author    = {Xiong, Yuwen and Ma, Wei-Chiu and Wang, Jingkang and Urtasun, Raquel},
  booktitle = {CVPR},
  year      = {2023},
}