ProtoUDA: Prototype-Based Unsupervised Adaptation for Cross-Domain Text Recognition

Cited by: 0
Authors
Liu, Xiao-Qian [1 ]
Ding, Xue-Ying [1 ]
Luo, Xin [1 ]
Xu, Xin-Shun [1 ]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text recognition; Prototypes; Feature extraction; Task analysis; Visualization; Decoding; Adaptation models; Unsupervised learning; prototype; text recognition; contrastive learning; domain adaptation; MODEL; DIFFUSION
DOI
10.1109/TKDE.2023.3344761
CLC number
TP18 (Theory of artificial intelligence)
Discipline codes
081104; 0812; 0835; 1405
Abstract
Text recognition reads real scene text or handwritten text, enabling many real-world applications such as driverless cars, visual question answering, and image-based machine translation. Although impressive results have been achieved in single-domain text recognition, the cross-domain setting remains highly challenging due to the domain gaps among synthetic text, real scene text, and handwritten text. Existing standard unsupervised domain adaptation (UDA) methods struggle with the text recognition task because they treat a domain or a text image (containing a character sequence) as a whole, ignoring the subunits that make up the sequence. In this paper, we present a Prototype-based Unsupervised Domain Adaptation method for text recognition (ProtoUDA), in which class prototypes are computed from the source domain, the target domain, and the mixed (source-target) domain, respectively. Technically, ProtoUDA first extracts pseudo-labeled character features under word-level supervision. Based on these character features, we then propose two parallel and complementary modules that perform class-level and instance-level alignment, explicitly transferring knowledge learned in the source domain to the target domain. Class-level alignment closes the distance between similar source prototypes and target prototypes. Instance-level alignment is based on contrastive learning: it pulls each character instance in the mixed domain toward its corresponding mixed-domain class prototype while pushing it away from the prototypes of other classes. To our knowledge, we are the first to adopt contrastive learning in UDA-based text recognition. Extensive experiments on several benchmark datasets show the superiority of our method over state-of-the-art methods.
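The prototype computation and instance-level contrastive alignment described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the use of class-mean prototypes, and the InfoNCE-style loss form are all assumptions inferred from the abstract's description.

```python
import numpy as np

def class_prototypes(features, labels, num_classes):
    """Estimate one prototype per character class as the mean of its features
    (a common choice; the paper's exact prototype construction may differ)."""
    protos = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    return protos

def instance_prototype_contrastive_loss(features, labels, protos, tau=0.1):
    """InfoNCE-style loss: pull each character feature toward its own class
    prototype while pushing it away from the other class prototypes."""
    # Cosine similarities between L2-normalized features and prototypes.
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    p = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + 1e-12)
    logits = f @ p.T / tau                       # (N, C) similarity logits
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-probability of each instance's own class prototype.
    return -log_prob[np.arange(len(labels)), labels].mean()
```

In ProtoUDA's terms, `features` would be the pseudo-labeled character features of the mixed (source-target) domain and `protos` the mixed-domain class prototypes; minimizing the loss makes each character instance close to its class prototype and far from the others, as the abstract describes.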
Pages: 9096-9108
Page count: 13