Fine-grained Pseudo Labels for Scene Text Recognition

Cited by: 0
Authors
Li, Xiaoyu [1]
Chen, Xiaoxue [1]
Huang, Zuming [1]
Xie, Lele [1]
Chen, Jingdong [1]
Yang, Ming [1]
Affiliations
[1] Ant Group, Hangzhou, People's Republic of China
Keywords
pseudo labels; domain shift; scene text recognition
DOI
10.1145/3581783.3611791
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised learning based on pseudo-labeling has shown promising results in Scene Text Recognition (STR). Most existing methods use a pre-trained model to generate sequence-level pseudo labels for text images and then re-train the model. Recently, pseudo-labeling within a teacher-student framework, in which a student model is supervised by pseudo labels from a teacher model, has become increasingly popular; it trains end-to-end and yields outstanding performance in semi-supervised learning. However, applying this framework directly to pseudo-labeling for STR leads to unstable convergence, because generating pseudo labels at the coarse-grained sequence level uses unlabeled data inefficiently. Furthermore, the inherent domain shift between labeled and unlabeled data lowers the quality of the derived pseudo labels. To mitigate these issues, we propose a novel Cross-domain Pseudo-Labeling (CPL) approach for scene text recognition, which makes better use of unlabeled data at the character level and provides more accurate pseudo labels. Specifically, our proposed Pseudo-Labeled Curriculum Learning dynamically adjusts the thresholds for different character classes according to the model's learning status. Moreover, an Adaptive Distribution Regularizer is employed to bridge the domain gap and improve the quality of pseudo labels. Extensive experiments show that CPL boosts representative STR models to state-of-the-art results on six challenging STR benchmarks. It also generalizes effectively to handwritten text.
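The idea of class-wise dynamic thresholds applied at the character level can be illustrated with a small sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the function names `class_thresholds` and `select_characters`, the base threshold of 0.9, the use of a per-class count of confident predictions as the "learning status", and the convex mapping beta / (2 - beta). It is not the CPL implementation.

```python
from typing import Dict, List, Tuple


def class_thresholds(learned: Dict[str, int], base: float = 0.9) -> Dict[str, float]:
    """Scale a base confidence threshold per character class.

    `learned` maps each character class to a count of confident predictions,
    used here as a stand-in for the model's learning status (an assumption).
    Classes the model has learned less well get a lower threshold.
    """
    top = max(learned.values(), default=0) or 1  # avoid division by zero
    thresholds = {}
    for ch, count in learned.items():
        beta = count / top  # relative learning status in [0, 1]
        # Hypothetical convex mapping; the paper's mapping is not stated here.
        thresholds[ch] = base * beta / (2.0 - beta)
    return thresholds


def select_characters(
    preds: List[Tuple[str, float]], thresholds: Dict[str, float]
) -> List[bool]:
    """Return a per-character mask: True where the pseudo label is kept.

    `preds` holds (predicted character, confidence) for each position of one
    text image. Unknown classes fall back to a threshold of 1.0.
    """
    return [conf >= thresholds.get(ch, 1.0) for ch, conf in preds]


if __name__ == "__main__":
    status = {"a": 100, "q": 20}  # "q" is rarer and less well learned
    th = class_thresholds(status)
    word = [("a", 0.95), ("q", 0.5), ("a", 0.6)]
    print(select_characters(word, th))  # [True, True, False]
```

In this toy word, a single sequence-level threshold of 0.9 would reject the whole image because one character has confidence 0.6, whereas the character-level mask still keeps two of the three positions as supervision.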
Pages: 5786-5795
Number of pages: 10