CrossMatch: Source-Free Domain Adaptive Semantic Segmentation via Cross-Modal Consistency Training

Cited by: 1

Authors:

Yin, Yifang [1]

Hu, Wenmiao [2,4]

Liu, Zhenguang [3]

Wang, Guanfeng [4]

Xiang, Shili [1]

Zimmermann, Roger [2]

Affiliations:

[1] ASTAR, Inst Infocomm Res, Singapore, Singapore

[2] Natl Univ Singapore, Singapore, Singapore

[3] Zhejiang Gongshang Univ, Hangzhou, Peoples R China

[4] Grabtaxi Holdings Pte Ltd, Singapore, Singapore

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

Keywords:

DOI:

10.1109/ICCV51070.2023.01991

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Source-free domain adaptive semantic segmentation has gained increasing attention recently. It eases the requirement of full access to the source domain by transferring knowledge only from a well-trained source model. However, reducing the uncertainty of the target pseudo labels becomes inevitably more challenging without the supervision of the labeled source data. In this work, we propose a novel asymmetric two-stream architecture that learns more robustly from noisy pseudo labels. Our approach simultaneously conducts dual-head pseudo label denoising and cross-modal consistency regularization. Towards the former, we introduce a multimodal auxiliary network during training (and discard it during inference), which effectively enhances the pseudo labels' correctness by leveraging the guidance from the depth information. Towards the latter, we enforce a new cross-modal pixel-wise consistency between the predictions of the two streams, encouraging our model to behave smoothly for both modality variance and image perturbations. It serves as an effective regularization to further reduce the impact of the inaccurate pseudo labels in source-free unsupervised domain adaptation. Experiments on GTA5 → Cityscapes and SYNTHIA → Cityscapes benchmarks demonstrate the superiority of our proposed method, obtaining the new state-of-the-art mIoU of 57.7% and 57.5%, respectively.


Pages: 21729-21739

Page count: 11

