LidarMultiNet: Towards a Unified Multi-Task Network for LiDAR Perception

Cited by: 0
Authors
Ye, Dongqiangzi [1]
Zhou, Zixiang [1,2]
Chen, Weijia [1]
Xie, Yufei [1]
Wang, Yu [1]
Wang, Panqu [1]
Foroosh, Hassan [2]
Affiliations
[1] TuSimple, San Diego, CA 92122 USA
[2] Univ Cent Florida, Orlando, FL USA
Keywords: (none listed)
DOI: Not available
CLC number: TP18 [Artificial Intelligence Theory];
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
LiDAR-based 3D object detection, semantic segmentation, and panoptic segmentation are usually implemented in specialized networks with distinctive architectures that are difficult to adapt to each other. This paper presents LidarMultiNet, a LiDAR-based multi-task network that unifies these three major LiDAR perception tasks. Among its many benefits, a multi-task network can reduce the overall cost by sharing weights and computation among multiple tasks. However, it typically underperforms compared to independently combined single-task models. The proposed LidarMultiNet aims to bridge the performance gap between the multi-task network and multiple single-task networks. At the core of LidarMultiNet is a strong 3D voxel-based encoder-decoder architecture with a Global Context Pooling (GCP) module extracting global contextual features from a LiDAR frame. Task-specific heads are added on top of the network to perform the three LiDAR perception tasks. More tasks can be implemented simply by adding new task-specific heads while introducing little additional cost. A second stage is also proposed to refine the first-stage segmentation and generate accurate panoptic segmentation results. LidarMultiNet is extensively tested on both Waymo Open Dataset and nuScenes dataset, demonstrating for the first time that major LiDAR perception tasks can be unified in a single strong network that is trained end-to-end and achieves state-of-the-art performance. Notably, LidarMultiNet reaches the official 1st place in the Waymo Open Dataset 3D semantic segmentation challenge 2022 with the highest mIoU and the best accuracy for most of the 22 classes on the test set, using only LiDAR points as input. It also sets the new state-of-the-art for a single model on the Waymo 3D object detection benchmark and three nuScenes benchmarks.
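The abstract describes a shared 3D voxel encoder-decoder, a Global Context Pooling (GCP) module that gathers scene-level context, and lightweight task-specific heads attached to the shared features. The following is a minimal, hypothetical PyTorch sketch of that shared-backbone, multi-head layout. The layer choices, channel sizes, dense 3D convolutions, and head outputs are illustrative assumptions only; the authors' network operates on sparse voxel features, uses a BEV encoder inside GCP, and adds a second refinement stage, none of which is reproduced here.

```python
# Hypothetical sketch of a shared encoder-decoder with task-specific heads,
# loosely following the description in the abstract. Not the authors' code.
import torch
import torch.nn as nn

class GlobalContextPooling(nn.Module):
    """Flatten one spatial axis (treated here as height) into channels to form a
    BEV-style 2D map, run 2D convolutions to gather global context, then fold the
    result back into the 3D feature volume as a residual."""
    def __init__(self, channels: int, depth: int):
        super().__init__()
        self.bev = nn.Sequential(
            nn.Conv2d(channels * depth, channels * depth, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels * depth, channels * depth, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, D, H, W) -> (B, C*D, H, W) -> 2D convs -> back to (B, C, D, H, W)
        b, c, d, h, w = x.shape
        bev = self.bev(x.reshape(b, c * d, h, w))
        return x + bev.reshape(b, c, d, h, w)  # residual fusion of global context

class LidarMultiTaskSketch(nn.Module):
    """Shared 3D encoder-decoder; each task only adds a small head on top."""
    def __init__(self, in_ch: int = 4, feat: int = 32, num_det: int = 3, num_sem: int = 22):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_ch, feat, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(feat, feat, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.gcp = GlobalContextPooling(feat, depth=8)  # assumes a 32-voxel-deep input grid
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(feat, feat, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose3d(feat, feat, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Task-specific heads share the same decoded features.
        self.det_head = nn.Conv3d(feat, num_det, 1)  # e.g. per-voxel detection logits
        self.sem_head = nn.Conv3d(feat, num_sem, 1)  # per-voxel semantic logits
        self.pan_head = nn.Conv3d(feat, 2, 1)        # e.g. instance-grouping channels

    def forward(self, voxels: torch.Tensor) -> dict:
        feats = self.decoder(self.gcp(self.encoder(voxels)))
        return {
            "detection": self.det_head(feats),
            "semantic": self.sem_head(feats),
            "panoptic": self.pan_head(feats),
        }

if __name__ == "__main__":
    net = LidarMultiTaskSketch()
    dummy = torch.randn(1, 4, 32, 64, 64)  # (batch, point features, D, H, W) voxel grid
    out = net(dummy)
    print({k: tuple(v.shape) for k, v in out.items()})
```

The point of the sketch is the cost argument made in the abstract: because the encoder, GCP module, and decoder are shared, adding another perception task amounts to attaching one more small head, which introduces little additional computation.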
Pages: 3231-3240
Page count: 10