NLFNet: Non-Local Fusion Towards Generalized Multimodal Semantic Segmentation across RGB-Depth, Polarization, and Thermal Images

Cited by: 13

Authors:

Yan, Ran [1]

Yang, Kailun [2]

Wang, Kaiwei [3]

Affiliations:

[1] Zhejiang Univ, State Key Lab Modern Opt Instrumentat, Hangzhou, Peoples R China

[2] Karlsruhe Inst Technol, Inst Anthropomat & Robot, Karlsruhe, Germany

[3] Zhejiang Univ, Natl Opt Instrumentat Engn Technol Res Ctr, Hangzhou, Peoples R China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE-ROBIO 2021) | 2021

Keywords:

DOI:

10.1109/ROBIO54168.2021.9739390

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In recent years, intelligent driving navigation has made considerable progress, and semantic segmentation is one of the most advanced scene perception methods. At present, traditional semantic segmentation methods can use RGB images for detection of obstacles that are clearly visible in outdoor scenes. However, in the face of complex realistic driving scenes, RGB images cannot provide sufficient information. We need some other modal information to supplement the RGB information. In this paper, we propose Non-Local Fusion Network (NLFNet), which is a semantic segmentation network that can selectively fuse multimodal input information in an adaptive manner. It can use complementary information collected by different optical sensors to extract effective features for fusion. Thereby, it improves the segmentation accuracy of the network and solves the problem of object recognition in various challenging real-world scenes. We conduct comprehensive experiments to verify the effectiveness and generalization ability of the framework across RGB-Depth, RGB-Polarization, and RGB-Thermal image semantic segmentation, which is especially suitable for autonomous driving and robot vision applications.

Pages: 1129-1135

Number of pages: 7

4 items in total

[1] Non-Local Aggregation for RGB-D Semantic Segmentation
Zhang, Guodong
Xue, Jing-Hao
Xie, Pengwei
Yang, Sifan
Wang, Guijin
IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 658 - 662
[2] An Iterative, Non-local Approach for Restoring Depth Maps in RGB-D Images
Bapat, Akash
Ravi, Adit
Raman, Shanmuganathan
2015 TWENTY FIRST NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2015,
[3] TSTR: A Real-Time RGB-Thermal Semantic Segmentation Model with Multimodal Fusion Transformers
Zhao, Guogiang
Yan, Xiaoyun
Cui, Aodie
Hu, Chang
Bao, Jiaqi
Huang, Junjie
2023 19TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING, MSN 2023, 2023, : 588 - 595
[4] Dual-modal non-local context guided multi-stage fusion for indoor RGB-D semantic segmentation
Guo, Xiangyu
Ma, Wei
Liang, Fangfang
Mi, Qing
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
