Aggregating transformers and CNNs for salient object detection in optical remote sensing images

Cited by: 21

Authors:

Bao, Liuxin ^{[1]}

Zhou, Xiaofei ^{[1]}

Zheng, Bolun ^{[1]}

Yin, Haibing ^{[2,3]}

Zhu, Zunjie ^{[2,3]}

Zhang, Jiyong ^{[1]}

Yan, Chenggang ^{[1,2]}

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310018, Peoples R China

[2] Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Peoples R China

[3] Hangzhou Dianzi Univ, Sch Commun Engn, Hangzhou 310018, Peoples R China

Source:

NEUROCOMPUTING | 2023 / Vol. 553

Funding:

National Natural Science Foundation of China;

Keywords:

Transformer; CNNs; Feature fusion; Optical RSIs; Salient object detection; ENCODER-DECODER NETWORK; ATTENTION; FEATURES;

DOI:

10.1016/j.neucom.2023.126560

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Salient object detection (SOD) in optical remote sensing images (RSIs) plays a significant role in many areas such as agriculture, environmental protection, and the military. However, because RSIs differ from natural scene images (NSIs) in imaging mode and image complexity, it is difficult to achieve remarkable results by directly extending saliency methods designed for NSIs to RSIs. Besides, we note that the U-Net based on convolutional neural networks (CNNs) cannot effectively acquire global long-range dependencies, and the Transformer does not adequately characterize the spatial local details of each patch. Therefore, to conduct salient object detection in RSIs, we propose a novel two-branch network for Aggregating Transformers and CNNs, namely ATC-Net, in which local spatial details and global semantic information are fused into the final high-quality saliency map. Specifically, our saliency model adopts an encoder-decoder architecture comprising two parallel encoder branches and a decoder. First, the two parallel encoder branches extract global and local features using a Transformer and CNNs, respectively. Then, the decoder employs a series of feature-enhanced fusion (FF) modules to aggregate multi-level global and local features through interactive guidance and to enhance the fused features via an attention mechanism. Finally, the decoder deploys a read-out (RO) module to fuse the aggregated feature from the FF modules with the low-level CNN feature, steering the feature to focus more on spatial local details. Extensive experiments are performed on two public optical RSI datasets, and the results show that our saliency model consistently outperforms 30 state-of-the-art methods.
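The data flow described in the abstract (two encoder branches, a chain of FF modules, then an RO module) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the internals of the FF or RO modules, so the gating, attention, and read-out formulas below are assumptions chosen only to make the flow concrete. The function names `ff_module`, `ro_module`, and `atc_decoder` are hypothetical, and features are modeled as flat lists of equal length, so no upsampling is shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ff_module(global_feat, local_feat, prev=None):
    """Hypothetical FF module: mutual gating followed by attention re-weighting."""
    # Interactive guidance (assumed form): each branch is gated by the other.
    g_guided = [g * sigmoid(l) for g, l in zip(global_feat, local_feat)]
    l_guided = [l * sigmoid(g) for g, l in zip(global_feat, local_feat)]
    fused = [a + b for a, b in zip(g_guided, l_guided)]
    # Carry over the output of the previous (deeper) FF module, if any.
    if prev is not None:
        fused = [f + p for f, p in zip(fused, prev)]
    # Attention enhancement (assumed form): residual softmax re-weighting.
    m = max(fused)
    exps = [math.exp(f - m) for f in fused]
    total = sum(exps)
    return [f * (1.0 + e / total) for f, e in zip(fused, exps)]


def ro_module(fused, low_level_cnn):
    """Hypothetical RO module: add the low-level CNN feature, squash to (0, 1)."""
    return [sigmoid(f + c) for f, c in zip(fused, low_level_cnn)]


def atc_decoder(global_feats, local_feats):
    """Apply FF modules from the deepest level to the shallowest, then read out.

    global_feats / local_feats: per-level features from the Transformer and
    CNN branches, ordered shallow to deep (index 0 is the low-level feature).
    """
    prev = None
    for g, l in zip(reversed(global_feats), reversed(local_feats)):
        prev = ff_module(g, l, prev)
    return ro_module(prev, local_feats[0])


if __name__ == "__main__":
    # Toy two-level features standing in for real encoder outputs.
    transformer_feats = [[0.1, -0.2, 0.3], [0.5, 0.0, -0.4]]
    cnn_feats = [[0.2, 0.1, -0.1], [-0.3, 0.6, 0.2]]
    print(atc_decoder(transformer_feats, cnn_feats))
```

The sketch keeps only the ordering stated in the abstract: global and local features are fused level by level, and the low-level CNN feature is reintroduced at the end to emphasize local detail.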


Pages: 14
