Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

Cited by: 24
Authors
Weng, Xi [1]
Yan, Yan [1]
Dong, Genshun [1]
Shu, Chang [2]
Wang, Biao [3]
Wang, Hanzi [1]
Zhang, Ji [3, 4]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen 361005, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
[3] Zhejiang Lab, Hangzhou 311101, Peoples R China
[4] Univ Southern Queensland, Sch Math Phys & Comp, Toowoomba, Qld 4350, Australia
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Semantics; Real-time systems; Image segmentation; Lattices; Decoding; Task analysis; Feature extraction; Deep learning; real-time semantic segmentation; lightweight convolutional neural networks; multi-branch aggregation;
DOI
10.1109/TITS.2022.3150350
Chinese Library Classification number
TU [Building Science]
Discipline classification code
0813
Abstract
Real-time semantic segmentation, which aims to achieve high segmentation accuracy at real-time inference speed, has received substantial attention over the past few years. However, many state-of-the-art real-time semantic segmentation methods sacrifice some spatial details or contextual information for fast inference, which degrades segmentation quality. In this paper, we propose a novel Deep Multi-branch Aggregation Network (DMA-Net), based on the encoder-decoder structure, to perform real-time semantic segmentation in street scenes. Specifically, we first adopt ResNet-18 as the encoder to efficiently generate feature maps at various levels from different stages of convolutions. Then, we develop a Multi-branch Aggregation Network (MAN) as the decoder to effectively aggregate the different levels of feature maps and capture multi-scale information. In MAN, a lattice enhanced residual block is designed to enhance the feature representations of the network by taking advantage of the lattice structure. Meanwhile, a feature transformation block is introduced to explicitly transform the feature map from the neighboring branch before feature aggregation. Moreover, a global context block is used to exploit global contextual information. These key components are tightly combined and jointly optimized in a unified network. Extensive experimental results on the challenging Cityscapes and CamVid datasets demonstrate that the proposed DMA-Net obtains 77.0% and 73.6% mean Intersection over Union (mIoU) at inference speeds of 46.7 FPS and 119.8 FPS, respectively, using only a single NVIDIA GTX 1080Ti GPU. This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.
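The abstract reports its results in two units, mIoU and FPS. As a minimal sketch of what those numbers measure (not the authors' evaluation code), the Python below computes mIoU from a confusion matrix and converts a frame rate into a per-frame time budget. The function names `mean_iou` and `frame_latency_ms` are hypothetical. The ignore label of 255 is an assumption that follows a common Cityscapes convention. Averaging only over classes that appear in the prediction or the ground truth is also an assumption.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean Intersection over Union for two flat sequences of class labels.

    ignore_index is an assumed "void" label that is excluded from scoring.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def frame_latency_ms(fps):
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Toy example with two classes and four pixels.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
    # Frame rates quoted in the abstract.
    print(frame_latency_ms(46.7), frame_latency_ms(119.8))
```

Under this conversion, 46.7 FPS corresponds to roughly 21.4 ms per frame and 119.8 FPS to roughly 8.3 ms per frame.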
Pages: 17224-17240
Page count: 17
Related papers
50 in total
  • [1] Deep Multi-Resolution Network for Real-Time Semantic Segmentation in Street Scenes
    Wang, Yalun
    Chen, Shidong
    Bian, Huicong
    Li, Weixiao
    Lu, Qin
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [2] Real-time semantic segmentation network for crops and weeds based on multi-branch structure
    Liu, Yufan
    Liu, Muhua
    Zhao, Xuhui
    Zhu, Junlong
    Wang, Lin
    Ma, Hao
    Zhang, Mingchuan
    IET COMPUTER VISION, 2024, 18 (08) : 1313 - 1324
  • [3] MDRNet: a lightweight network for real-time semantic segmentation in street scenes
    Dai, Yingpeng
    Wang, Junzheng
    Li, Jiehao
    Li, Jing
    ASSEMBLY AUTOMATION, 2021, 41 (06) : 725 - 733
  • [4] Reconsidering Multi-Branch Aggregation for Semantic Segmentation
    Cai, Pengjie
    Yang, Derong
    Zou, Yonglin
    Chen, Ruihan
    Dai, Ming
    ELECTRONICS, 2023, 12 (15)
  • [5] Multi-directional feature refinement network for real-time semantic segmentation in urban street scenes
    Zhou, Yan
    Zheng, Xihong
    Yang, Yin
    Li, Jianxun
    Mu, Jinzhen
    Irampaye, Richard
    IET COMPUTER VISION, 2023, 17 (04) : 431 - 444
  • [6] Gated feature aggregate and alignment network for real-time semantic segmentation of street scenes
    Liu, Qian
    Li, Zhensheng
    Qi, Youwei
    Wang, Cunbao
    MULTIMEDIA SYSTEMS, 2024, 30 (04)
  • [7] Triple-Branch Asymmetric Network for Real-time Semantic Segmentation of Road Scenes
    Zhang, Yazhi
    Zhang, Xuguang
    Yu, Hui
    Instrumentation, 2024, 11 (02) : 72 - 82
  • [8] A spatial-frequency domain multi-branch decoder method for real-time semantic segmentation
    Deng, Liwei
    Wu, Boda
    Chen, Songyu
    Li, Dongxue
    Fang, Yanze
    IMAGE AND VISION COMPUTING, 2025, 156
  • [9] Stage-Aware Feature Alignment Network for Real-Time Semantic Segmentation of Street Scenes
    Weng, Xi
    Yan, Yan
    Chen, Si
    Xue, Jing-Hao
    Wang, Hanzi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (07) : 4444 - 4459
  • [10] Real-time Semantic Segmentation with Context Aggregation Network
    Yang, Michael Ying
    Kumaar, Saumya
    Lyu, Ye
    Nex, Francesco
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 178 : 124 - 134