Encoder-Decoder Structure Fusing Depth Information for Outdoor Semantic Segmentation

Cited by: 1
Authors
Chen, Songnan [1]
Tang, Mengxia [2]
Dong, Ruifang [2]
Kan, Jiangming [2]
Affiliations
[1] Wuhan Polytech Univ, Sch Math & Comp Sci, 36 Huanhu Middle Rd, Wuhan 430048, Peoples R China
[2] Beijing Forestry Univ, Sch Technol, 35 Qinghua East Rd, Beijing 100083, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 17
Keywords
semantic segmentation; RGB-D image; predicted depth map; fusion structure; feature pyramid; NETWORK;
DOI
10.3390/app13179924
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The semantic segmentation of outdoor images is the cornerstone of scene understanding and plays a crucial role in the autonomous navigation of robots. Although RGB-D images provide additional depth information that can improve semantic segmentation, current state-of-the-art methods fuse ground-truth depth maps directly, which relies on sophisticated and expensive depth sensors. To address this problem, we propose a self-calibrated RGB-D semantic segmentation network, based on an improved residual network, that does not rely on depth sensors: it fuses RGB images with depth maps predicted by depth estimation models and uses this multi-modal information to enhance scene understanding. First, we design a novel convolutional neural network (CNN) with an encoder-decoder structure as our semantic segmentation model. The encoder is built on IResNet to extract semantic features from the RGB image and the predicted depth map, which are then fused by the self-calibration fusion structure. The decoder restores the resolution of the output features through a series of successive upsampling structures. Second, we present a feature pyramid attention mechanism that extracts the fused information at multiple scales to obtain features with rich semantic information. Experimental results on the publicly available Cityscapes dataset and on collected forest scene images show that our model, trained with estimated depth information, achieves segmentation accuracy comparable to that obtained with ground-truth depth maps and even outperforms some competitive methods.
Pages: 17
Related papers
50 in total
  • [31] Optimizing FPGA-based Convolutional Encoder-Decoder Architecture for Semantic Segmentation
    Yu, Mengqi
    Huang, Hongzhi
    Liu, Hong
    He, Shuyi
    Qiao, Fei
    Luo, Li
    Xie, Fugui
    Liu, Xin-Jun
    Yang, Huazhong
    2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019, : 1436 - 1440
  • [32] ENDE-GNN: An Encoder-decoder GNN Framework for Sketch Semantic Segmentation
    Zheng, Yixiao
    Xie, Jiyang
    Sain, Aneeshan
    Ma, Zhanyu
    Song, Yi-Zhe
    Guo, Jun
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022,
  • [33] Convolutional neural network based encoder-decoder architectures for semantic segmentation of plants
    Kolhar, Shrikrishna
    Jagtap, Jayant
    ECOLOGICAL INFORMATICS, 2021, 64
  • [34] DXNet: An Encoder-Decoder Architecture with XSPP for Semantic Image Segmentation in Street Scenes
    Shang, Yexin
    Zhong, Shan
    Gong, Shengrong
    Zhou, Lifan
    Ying, Wenhao
    NEURAL INFORMATION PROCESSING, ICONIP 2019, PT V, 2019, 1143 : 550 - 557
  • [35] GeoSegNet: point cloud semantic segmentation via geometric encoder-decoder modeling
    Chen, Chen
    Wang, Yisen
    Chen, Honghua
    Yan, Xuefeng
    Ren, Dayong
    Guo, Yanwen
    Xie, Haoran
    Wang, Fu Lee
    Wei, Mingqiang
    VISUAL COMPUTER, 2024, 40 (08): : 5107 - 5121
  • [36] Weed detection in precision agriculture: leveraging encoder-decoder models for semantic segmentation
    Thiagarajan S.
    Vijayalakshmi A.
    Grace G.H.
    Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (9) : 3547 - 3561
  • [37] Encoder-decoder network with RMP for tongue segmentation
    Kusakunniran, Worapan
    Borwarnginn, Punyanuch
    Karnjanapreechakorn, Sarattha
    Thongkanchorn, Kittikhun
    Ritthipravat, Panrasee
    Tuakta, Pimchanok
    Benjapornlert, Paitoon
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (05) : 1193 - 1207
  • [38] Encoder-decoder network with RMP for tongue segmentation
    Worapan Kusakunniran
    Punyanuch Borwarnginn
    Sarattha Karnjanapreechakorn
    Kittikhun Thongkanchorn
    Panrasee Ritthipravat
    Pimchanok Tuakta
    Paitoon Benjapornlert
    Medical & Biological Engineering & Computing, 2023, 61 : 1193 - 1207
  • [39] Retinal vessel image segmentation algorithm based on encoder-decoder structure
    ZhengLi Zhai
    Shu Feng
    Luyao Yao
    Penghui Li
    Multimedia Tools and Applications, 2022, 81 : 33361 - 33373
  • [40] SDDS-Net: Space and Depth Encoder-Decoder Convolutional Neural Networks for Real-Time Semantic Segmentation
    Ibrahem, Hatem
    Salem, Ahmed
    Kang, Hyun-Soo
    IEEE ACCESS, 2023, 11 : 119362 - 119372