LidarMultiNet: Towards a Unified Multi-Task Network for LiDAR Perception

Cited by: 0
Authors
Ye, Dongqiangzi [1]
Zhou, Zixiang [1, 2]
Chen, Weijia [1]
Xie, Yufei [1]
Wang, Yu [1]
Wang, Panqu [1]
Foroosh, Hassan [2]
Affiliations
[1] TuSimple, San Diego, CA 92122 USA
[2] Univ Cent Florida, Orlando, FL USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
LiDAR-based 3D object detection, semantic segmentation, and panoptic segmentation are usually implemented in specialized networks with distinctive architectures that are difficult to adapt to each other. This paper presents LidarMultiNet, a LiDAR-based multi-task network that unifies these three major LiDAR perception tasks. Among its many benefits, a multi-task network can reduce overall cost by sharing weights and computation among multiple tasks. However, it typically underperforms a combination of independent single-task models. The proposed LidarMultiNet aims to bridge the performance gap between the multi-task network and multiple single-task networks. At the core of LidarMultiNet is a strong 3D voxel-based encoder-decoder architecture with a Global Context Pooling (GCP) module that extracts global contextual features from a LiDAR frame. Task-specific heads are added on top of the network to perform the three LiDAR perception tasks. More tasks can be implemented simply by adding new task-specific heads, at little additional cost. A second stage is also proposed to refine the first-stage segmentation and generate accurate panoptic segmentation results. LidarMultiNet is extensively tested on both the Waymo Open Dataset and the nuScenes dataset, demonstrating for the first time that major LiDAR perception tasks can be unified in a single strong network that is trained end-to-end and achieves state-of-the-art performance. Notably, LidarMultiNet took the official 1st place in the Waymo Open Dataset 3D semantic segmentation challenge 2022, with the highest mIoU and the best accuracy for most of the 22 classes on the test set, using only LiDAR points as input. It also sets a new state of the art for a single model on the Waymo 3D object detection benchmark and three nuScenes benchmarks.
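The shared-backbone idea in the abstract (one encoder whose features are reused by several task-specific heads) can be illustrated with a toy sketch. This is not the LidarMultiNet implementation: the class `ToyMultiTaskNet`, its random linear layers, the feature sizes, and the head output sizes (7 values for detection, 22 scores for the segmentation heads) are all hypothetical choices made for illustration, and there is no voxelization, GCP module, or second stage here.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return a random weight matrix: n_out rows of n_in weights each."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, x):
    """Multiply the weight matrix by the input vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


class ToyMultiTaskNet:
    """Hypothetical stand-in: one shared encoder, one small head per task."""

    def __init__(self, n_in=8, n_feat=16, seed=0):
        self.rng = random.Random(seed)
        self.n_feat = n_feat
        self.encoder = make_linear(n_in, n_feat, self.rng)  # shared weights
        self.heads = {}          # task name -> head weights
        self.encoder_calls = 0   # counts how often the shared part runs

    def add_head(self, task, n_out):
        """Attach a new task-specific head on top of the shared features."""
        self.heads[task] = make_linear(self.n_feat, n_out, self.rng)

    def forward(self, x):
        """Run the encoder once, then feed the same features to every head."""
        self.encoder_calls += 1
        feats = relu(apply_linear(self.encoder, x))
        return {task: apply_linear(w, feats) for task, w in self.heads.items()}


net = ToyMultiTaskNet()
net.add_head("detection", 7)               # assumed box-parameter count
net.add_head("semantic_segmentation", 22)  # 22 classes, as in the abstract
net.add_head("panoptic_segmentation", 22)  # assumed output size
outputs = net.forward([0.5] * 8)
print({task: len(out) for task, out in outputs.items()})
print("encoder passes:", net.encoder_calls)
```

In this sketch, adding a task only adds one small head, and the encoder still runs once per input regardless of how many heads are attached, which is the cost-sharing argument the abstract makes.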
Pages: 3231-3240
Page count: 10