Encoder-Decoder Structure Fusing Depth Information for Outdoor Semantic Segmentation

Cited by: 1
Authors
Chen, Songnan [1]
Tang, Mengxia [2]
Dong, Ruifang [2]
Kan, Jiangming [2]
Affiliations
[1] Wuhan Polytech Univ, Sch Math & Comp Sci, 36 Huanhu Middle Rd, Wuhan 430048, Peoples R China
[2] Beijing Forestry Univ, Sch Technol, 35 Qinghua East Rd, Beijing 100083, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 17
Keywords
semantic segmentation; RGB-D image; predicted depth map; fusion structure; feature pyramid; NETWORK;
DOI
10.3390/app13179924
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The semantic segmentation of outdoor images is a cornerstone of scene understanding and plays a crucial role in the autonomous navigation of robots. Although RGB-D images provide additional depth information that can improve semantic segmentation, current state-of-the-art methods fuse ground truth depth maps directly, which relies on highly developed and expensive depth sensors. To address this problem, we proposed a self-calibrated RGB-D semantic segmentation neural network, based on an improved residual network, that does not rely on depth sensors. It fuses RGB images with depth maps predicted by depth estimation models and uses this multi-modal information to enhance scene understanding. First, we designed a novel convolutional neural network (CNN) with an encoder-decoder structure as our semantic segmentation model. The encoder was constructed using IResNet to extract the semantic features of the RGB image and the predicted depth map, which were then effectively fused with the self-calibration fusion structure. The decoder restored the resolution of the output features with a series of successive upsampling structures. Second, we presented a feature pyramid attention mechanism to extract the fused information at multiple scales and obtain features with rich semantic information. Experimental results on the publicly available Cityscapes dataset and on collected forest scene images show that our model, trained with estimated depth information, achieves performance comparable to that obtained with ground truth depth maps in improving semantic segmentation accuracy, and even outperforms some competitive methods.
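The abstract describes the data flow only at a high level: two encoder branches (RGB and predicted depth), a fusion step, multi-scale context extraction, and a decoder that upsamples back to the input resolution. The NumPy sketch below traces that flow on random data. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, average pooling with a 1x1 channel mix stands in for IResNet, a simple sigmoid gate stands in for the self-calibration fusion structure, plain multi-scale averaging stands in for the feature pyramid attention mechanism, and the weights are random rather than trained.

```python
import numpy as np

def avg_pool(x, k):
    """Average-pool a (C, H, W) array with a k x k window and stride k."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def upsample(x, k):
    """Nearest-neighbour upsample a (C, H, W) array by an integer factor k."""
    return x.repeat(k, axis=1).repeat(k, axis=2)

def encode(x, weight, stride=4):
    """Stand-in encoder branch (assumption): downsample, mix channels, apply ReLU."""
    pooled = avg_pool(x, stride)
    mixed = np.einsum("oc,chw->ohw", weight, pooled)  # 1x1 convolution
    return np.maximum(mixed, 0.0)

def gated_fusion(rgb_feat, depth_feat):
    """Hypothetical fusion: a sigmoid gate scales the depth features per position."""
    gate = 1.0 / (1.0 + np.exp(-(rgb_feat + depth_feat)))
    return rgb_feat + gate * depth_feat

def pyramid_context(feat, scales=(1, 2, 4)):
    """Average the features pooled at several scales, each resized back."""
    levels = [upsample(avg_pool(feat, s), s) for s in scales]
    return np.mean(levels, axis=0)

def decode(feat, class_weight, steps=2):
    """Restore resolution with successive 2x upsampling, then score each class."""
    for _ in range(steps):
        feat = upsample(feat, 2)
    return np.einsum("kc,chw->khw", class_weight, feat)

def segment(rgb, depth, params):
    """Run the full sketch: encode both inputs, fuse, add context, decode."""
    rgb_feat = encode(rgb, params["rgb"])
    depth_feat = encode(depth, params["depth"])
    fused = gated_fusion(rgb_feat, depth_feat)
    context = pyramid_context(fused)
    logits = decode(context, params["classes"])
    return logits.argmax(axis=0)  # one class index per pixel

rng = np.random.default_rng(0)
params = {
    "rgb": rng.normal(size=(8, 3)),      # 3 colour channels -> 8 features
    "depth": rng.normal(size=(8, 1)),    # 1 depth channel -> 8 features
    "classes": rng.normal(size=(5, 8)),  # 8 features -> 5 class scores
}
rgb = rng.random((3, 32, 64))
depth = rng.random((1, 32, 64))  # stands in for a predicted depth map
labels = segment(rgb, depth, params)
print(labels.shape)  # (32, 64)
```

The encoder reduces resolution by a factor of 4 and the decoder applies two 2x upsampling steps, so the label map matches the input size. In a trained model each stand-in would be replaced by learned layers.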
Pages: 17
Related papers
50 in total
  • [11] A serial semantic segmentation model based on encoder-decoder architecture
    Zhou, Yan
    KNOWLEDGE-BASED SYSTEMS, 2024, 295
  • [12] Fast Real-time Semantic Segmentation Network with an Asymmetric Encoder-Decoder Structure
    Rui, Tang
    Yan, Li Hui
    Kai, Xu
    Yi, Ding
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 2408 - 2413
  • [13] J-Net: Asymmetric Encoder-Decoder for Medical Semantic Segmentation
    Shi, Yanli
    Sheng, Pengpeng
    SECURITY AND COMMUNICATION NETWORKS, 2021, 2021
  • [14] BANet: Boundary-Assistant Encoder-Decoder Network for Semantic Segmentation
    Zhou, Quan
    Qiang, Yong
    Mo, Yuwei
    Wu, Xiaofu
    Latecki, Longin Jan
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 25259 - 25270
  • [15] Pooling Attention-based Encoder-Decoder Network for semantic segmentation
    Xu, Haixia
    Huang, Yunjia
    Hancock, Edwin R.
    Wang, Shuailong
    Xuan, Qijun
    Zhou, Wei
    COMPUTERS & ELECTRICAL ENGINEERING, 2021, 93
  • [16] A Residual Encoder-Decoder Network for Semantic Segmentation in Autonomous Driving Scenarios
    Naresh, Y. G.
    Little, Suzanne
    O'Connor, Noel E.
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1052 - 1056
  • [17] Binarized Encoder-Decoder Network and Binarized Deconvolution Engine for Semantic Segmentation
    Kim, Hyunwoo
    Kim, Jeonghoon
    Choi, Jungwook
    Lee, Jungkeol
    Song, Yong Ho
    IEEE ACCESS, 2021, 9 : 8006 - 8027
  • [18] SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
    Xie, Bin
    Cao, Jiale
    Xie, Jin
    Khan, Fahad Shahbaz
    Pang, Yanwei
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 3426 - 3436
  • [19] Residual quadratic encoder-decoder architecture for semantic segmentation of satellite images
    Bagwari, Neha
    Verma, Vivek Singh
    Kumar, Sushil
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (01)
  • [20] Semantic Segmentation of Large-Scale Outdoor Point Clouds by Encoder-Decoder Shared MLPs with Multiple Losses
    Rim, Beanbonyka
    Lee, Ahyoung
    Hong, Min
    REMOTE SENSING, 2021, 13 (16)