NLFNet: Non-Local Fusion Towards Generalized Multimodal Semantic Segmentation across RGB-Depth, Polarization, and Thermal Images

Cited by: 13
Authors
Yan, Ran [1]
Yang, Kailun [2]
Wang, Kaiwei [3]
Affiliations
[1] Zhejiang Univ, State Key Lab Modern Opt Instrumentat, Hangzhou, Peoples R China
[2] Karlsruhe Inst Technol, Inst Anthropomat & Robot, Karlsruhe, Germany
[3] Zhejiang Univ, Natl Opt Instrumentat Engn Technol Res Ctr, Hangzhou, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE-ROBIO 2021) | 2021
DOI
10.1109/ROBIO54168.2021.9739390
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, intelligent driving navigation has made considerable progress, and semantic segmentation is one of the most advanced scene perception methods. At present, traditional semantic segmentation methods can use RGB images for detection of obstacles that are clearly visible in outdoor scenes. However, in the face of complex realistic driving scenes, RGB images cannot provide sufficient information. We need some other modal information to supplement the RGB information. In this paper, we propose Non-Local Fusion Network (NLFNet), which is a semantic segmentation network that can selectively fuse multimodal input information in an adaptive manner. It can use complementary information collected by different optical sensors to extract effective features for fusion. Thereby, it improves the segmentation accuracy of the network and solves the problem of object recognition in various challenging real-world scenes. We conduct comprehensive experiments to verify the effectiveness and generalization ability of the framework across RGB-Depth, RGB-Polarization, and RGB-Thermal image semantic segmentation, which is especially suitable for autonomous driving and robot vision applications.
Pages: 1129-1135
Page count: 7