Automatic Network Architecture Search for RGB-D Semantic Segmentation

Cited by: 2

Authors:

Wang, Wenna [1]

Zhuo, Tao [2]

Zhang, Xiuwei [1]

Sun, Mingjun [1]

Yin, Hanlin [1]

Xing, Yinghui [1]

Zhang, Yanning [1]

Affiliations:

[1] Northwestern Polytech Univ, Xian, Peoples R China

[2] Qilu Univ Technol, Shandong Acad Sci, Shandong Artificial Intelligence Inst, Jinan, Peoples R China

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

RGB-D semantic segmentation; NAS; grid-like network-level search space; hierarchical cell-level search space; search strategy;

DOI:

10.1145/3581783.3612288

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent RGB-D semantic segmentation networks are usually designed by hand. Because manual design is constrained by human effort and time, the resulting networks may perform poorly in complex scenarios. To address this issue, we propose the first Neural Architecture Search (NAS) method that designs the network automatically. Specifically, the target network consists of an encoder and a decoder. The encoder has two independent branches, one extracting features from the RGB image and the other from the depth image. The decoder fuses these features and generates the final segmentation result. For automatic network design, we construct a grid-like network-level search space combined with a hierarchical cell-level search space. An effective gradient-based search strategy is then developed to discover the network structure together with its hierarchical cell architectures. Extensive results on two datasets show that the proposed method outperforms state-of-the-art approaches, achieving an mIoU of 55.1% on the NYU-Depth v2 dataset and 50.3% on the SUN-RGBD dataset.
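
For readers unfamiliar with the metric, mIoU (mean intersection-over-union) averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch in plain Python, assuming flat lists of integer class labels. The function name `mean_iou`, the toy inputs, and the choice to skip classes absent from both maps are illustrative assumptions, not the authors' evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels" (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Benchmark protocols may differ in details such as ignored labels or how absent classes are handled, so this sketch is only meant to make the definition concrete.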

Pages: 3777-3786

Page count: 10
