DiZNet: An end-to-end text detection and recognition algorithm with detail in text zone

Cited by: 0
Authors
Zhou, Di [1]
Zhang, Jianxun [1]
Li, Chao [2 ]
Affiliations
[1] Chongqing Univ Technol, Dept Comp Sci & Engn, Chongqing, Peoples R China
[2] Jingdezhen Ceram Univ, Jingdezhen, Jiangxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scene Text Spotting; Text Attention; Feature Pyramid Enhancement Fusion; Text Detail Map Representation;
DOI
10.1016/j.jvcir.2024.104261
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes DiZNet, an efficient and novel end-to-end text detection and recognition framework. DiZNet is built upon a core representation using text detail maps and employs the classical lightweight ResNet18 as the backbone of the text detection and recognition model. The redesigned Text Attention Head (TAH) takes multiple shallow backbone features as input, effectively extracting pixel-level text information and global text positional features from images. The extracted text features are integrated into the stackable Feature Pyramid Enhancement Fusion Module (FPEFM). Supervised with text detail map labels, which include the boundary information and texture of important text, the model predicts text detail maps and fuses them into the text detection and recognition heads. In end-to-end testing on publicly available natural scene text benchmark datasets, the approach demonstrates robust generalization and real-time detection speeds. Leveraging the advantages of the text detail map representation, DiZNet achieves a good balance between precision and efficiency on challenging datasets. For example, DiZNet achieves 91.2% precision and 85.9% F-measure at 38.4 FPS on Total-Text, and 83.8% F-measure at 30.0 FPS on ICDAR2015. The code is publicly available at: https://github.com/DiZ-gogogo/DiZNet
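The abstract reports precision and F-measure on Total-Text but not recall. As a minimal sketch, assuming the F-measure is the standard harmonic mean of precision and recall and that both reported figures come from the same evaluation (neither is stated in the abstract), the implied recall can be back-computed. The function names below are hypothetical and are not taken from the DiZNet repository.

```python
def f_measure(precision: float, recall: float) -> float:
    """Standard F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from(precision: float, f: float) -> float:
    """Invert F = 2PR / (P + R) for R, giving R = F * P / (2P - F)."""
    denom = 2 * precision - f
    if denom <= 0:
        raise ValueError("no valid recall for this precision/F-measure pair")
    return f * precision / denom


# Figures reported in the abstract for Total-Text.
p, f = 0.912, 0.859

# Implied recall under the assumptions stated above (not a reported result).
r = recall_from(p, f)
print(f"implied recall: {r:.3f}")
```

Under these assumptions the implied recall is roughly 81%, a derived estimate rather than a figure from the paper.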
Pages: 13
Related papers
50 in total
  • [1] An End-to-End Scene Text Recognition for Bilingual Text
    Albalawi, Bayan M.
    Jamal, Amani T.
    Al Khuzayem, Lama A.
    Alsaedi, Olaa A.
    BIG DATA AND COGNITIVE COMPUTING, 2024, 8 (09)
  • [2] END-TO-END CHINESE TEXT RECOGNITION
    Hu, Jie
    Guo, Tszhang
    Cao, Ji
    Zhang, Changshui
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 1407 - 1411
  • [3] End-to-End Scene Text Recognition
    Wang, Kai
    Babenko, Boris
    Belongie, Serge
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 1457 - 1464
  • [4] EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition
    Hao, Jiedong
    Wen, Yafei
    Deng, Jie
    Gan, Jun
    Ren, Shuai
    Tan, Hui
    Chen, Xiaoxin
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824 : 95 - 108
  • [5] End-to-End Analysis for Text Detection and Recognition in Natural Scene Images
    Alnefaie, Ahlam
    Gupta, Deepak
    Bhuyan, Monowar H.
    Razzak, Imran
    Gupta, Prashant
    Prasad, Mukesh
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [6] End-to-end Speech-to-Punctuated-Text Recognition
    Nozaki, Jumon
    Kawahara, Tatsuya
    Ishizuka, Kenkichi
    Hashimoto, Taiichi
    INTERSPEECH 2022, 2022, : 1811 - 1815
  • [7] End-to-End Text Recognition with Convolutional Neural Networks
    Wang, Tao
    Wu, David J.
    Coates, Adam
    Ng, Andrew Y.
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3304 - 3308
  • [8] Speech-and-Text Transformer: Exploiting Unpaired Text for End-to-End Speech Recognition
    Wang, Qinyi
    Zhou, Xinyuan
    Li, Haizhou
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2023, 12 (01)
  • [9] Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion
    Makhmudov, Fazliddin
    Mukhiddinov, Mukhriddin
    Abdusalomov, Akmalbek
    Avazov, Kuldoshbay
    Khamdamov, Utkir
    Cho, Young Im
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2020, 18 (06)
  • [10] An end-to-end text spotter with text relation networks
    Jiang, Jianguo
    Wei, Baole
    Yu, Min
    Li, Gang
    Li, Boquan
    Liu, Chao
    Li, Min
    Huang, Weiqing
    Cybersecurity, 4