Chongjian Ge*,
Junsong Chen*,
Enze Xie+,
Zhongdao Wang,
Lanqing Hong,
Huchuan Lu,
Zhenguo Li,
Ping Luo+
(* denotes equal contribution, + denotes corresponding authors)
- (20/04/2023) MetaBEV is released on arXiv.
Perception systems in modern autonomous driving vehicles typically take inputs from complementary multi-modal sensors, e.g., LiDAR and cameras. However, in real-world applications, sensor corruptions and failures degrade performance and thus compromise driving safety.
In this paper, we propose a robust framework, called MetaBEV, to address extreme real-world environments, covering six sensor corruptions and two extreme sensor-missing situations in total.
Experiments show that MetaBEV outperforms prior art by a large margin on both full and corrupted modalities. For instance, when the LiDAR signal is missing, MetaBEV improves detection NDS by 35.5% and segmentation mIoU by 17.7% over the vanilla BEVFusion model; when the camera signal is absent, MetaBEV still achieves 69.2% NDS and 53.7% mIoU, which is even higher than previous works that operate on full modalities. Moreover, MetaBEV performs competitively against previous methods in both canonical perception and multi-task learning settings, setting a new state of the art on nuScenes BEV map segmentation with 70.4% mIoU.
Our model achieves the following performance:
- Detection on nuScenes val set with LiDAR and Camera.
Methods | Modality | Multi-Task | mAP(val) | NDS(val) |
---|---|---|---|---|
MetaBEV-Transfusion | Camera | x | 49.4 | 49.7 |
MetaBEV-Centerhead | Camera | x | 55.5 | 60.4 |
MetaBEV-Transfusion | LiDAR | x | 62.5 | 68.6 |
MetaBEV-Centerhead | LiDAR | x | 64.2 | 69.3 |
MetaBEV-Transfusion | Camera+LiDAR | x | 68.0 | 71.5 |
MetaBEV-Transfusion | Camera+LiDAR | √ | 65.4 | 69.8 |
- Segmentation on nuScenes val set with LiDAR and Camera.
Methods | Modality | Drivable | Ped.Cross | Walkway | Stop Line | Carpark | Divider | Mean |
---|---|---|---|---|---|---|---|---|
MetaBEV | Camera | 83.3 | 56.7 | 61.4 | 50.8 | 55.5 | 48.0 | 59.3 |
MetaBEV | LiDAR | 87.9 | 63.4 | 71.6 | 55.0 | 55.1 | 55.7 | 64.8 |
MetaBEV | Camera+LiDAR | 89.6 | 68.4 | 74.8 | 63.3 | 64.4 | 61.8 | 70.4 |
MetaBEV | Camera+LiDAR | 88.5 | 64.9 | 71.8 | 56.7 | 61.1 | 58.2 | 66.9 |
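
As a quick sanity check on the segmentation table above, the `Mean` column appears to be the unweighted average of the six per-class IoU values. The sketch below recomputes it from the numbers in the table; the dictionary keys (including the `row 1` / `row 2` labels for the two Camera+LiDAR rows) and the helper name `mean_iou` are ours for illustration and are not part of the MetaBEV code base.

```python
# Per-class IoU values copied from the segmentation table, in column order:
# Drivable, Ped.Cross, Walkway, Stop Line, Carpark, Divider.
# The row labels are illustrative; the table itself lists two Camera+LiDAR rows.
SEG_ROWS = {
    "Camera": [83.3, 56.7, 61.4, 50.8, 55.5, 48.0],
    "LiDAR": [87.9, 63.4, 71.6, 55.0, 55.1, 55.7],
    "Camera+LiDAR (row 1)": [89.6, 68.4, 74.8, 63.3, 64.4, 61.8],
    "Camera+LiDAR (row 2)": [88.5, 64.9, 71.8, 56.7, 61.1, 58.2],
}

# Values from the "Mean" column of the same table.
REPORTED_MEAN = {
    "Camera": 59.3,
    "LiDAR": 64.8,
    "Camera+LiDAR (row 1)": 70.4,
    "Camera+LiDAR (row 2)": 66.9,
}


def mean_iou(per_class_iou):
    """Unweighted arithmetic mean over the per-class IoU values."""
    return sum(per_class_iou) / len(per_class_iou)


if __name__ == "__main__":
    for name, ious in SEG_ROWS.items():
        print(f"{name}: computed {mean_iou(ious):.1f}, reported {REPORTED_MEAN[name]}")
```

Under this assumption, every recomputed mean agrees with the reported value to one decimal place.
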
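Methods | Camera+LiDAR mAP | Camera+LiDAR NDS | Camera+LiDAR mIoU | Missing Camera mAP | Missing Camera NDS | Missing Camera mIoU | Missing LiDAR mAP | Missing LiDAR NDS | Missing LiDAR mIoU |
---|---|---|---|---|---|---|---|---|---|
MetaBEV | 68.0 | 71.5 | 70.4 | 63.6 | 69.2 | 53.7 | 39.0 | 42.6 | 54.4 |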
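
To make the sensor-missing numbers above easier to read, the following sketch computes the absolute drop (in points) from the full Camera+LiDAR setting for each metric. It only uses the values reported in the table; the names `RESULTS` and `drop_from_full` are illustrative and do not come from the MetaBEV repository.

```python
# Metrics copied from the sensor-missing table (nuScenes val set).
RESULTS = {
    "Camera+LiDAR": {"mAP": 68.0, "NDS": 71.5, "mIoU": 70.4},
    "Missing Camera": {"mAP": 63.6, "NDS": 69.2, "mIoU": 53.7},
    "Missing LiDAR": {"mAP": 39.0, "NDS": 42.6, "mIoU": 54.4},
}


def drop_from_full(setting):
    """Absolute drop in points relative to the full Camera+LiDAR setting."""
    full = RESULTS["Camera+LiDAR"]
    degraded = RESULTS[setting]
    return {metric: round(full[metric] - degraded[metric], 1) for metric in full}


if __name__ == "__main__":
    for setting in ("Missing Camera", "Missing LiDAR"):
        print(setting, drop_from_full(setting))
```

Read this way, losing the camera costs about 2.3 NDS but 16.7 mIoU, while losing LiDAR costs about 28.9 NDS and 16.0 mIoU.
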
This project is based on mmdetection3d, BEVFusion, and the robust benchmark. Thanks for their awesome work.
This project is under the MIT license. See LICENSE for details.
If you find MetaBEV useful or relevant to your research, please consider citing our paper:
@article{ge2023metabev,
title={MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation},
author={Ge, Chongjian and Chen, Junsong and Xie, Enze and Wang, Zhongdao and Hong, Lanqing and Lu, Huchuan and Li, Zhenguo and Luo, Ping},
journal={arXiv preprint arXiv:2304.09801},
year={2023}
}