A real-time and effective text detection method for multi-scale and fuzzy text

Cited by: 2
Authors
Tong, Guoxiang [1]
Dong, Ming [1]
Song, Yan [1]
Affiliation
[1] Univ Shanghai Sci & Technol, Dept Opt Elect & Comp Engn, Shanghai 200093, Peoples R China
Keywords
Natural scene text detection; Attention mechanism; Feature path augmentation; CIoU loss; SCENE; ACCURATE
DOI
10.1007/s11554-023-01267-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text in natural scenes takes many forms, and dynamic blur and geometric perspective distortion greatly degrade text detection. To address this, a real-time and effective method is proposed for detecting multi-scale and fuzzy text. The method applies a convolutional attention mechanism to the feature extraction backbone to obtain more valuable text feature maps. To make full use of the precise text location signals in low-level features, a bottom-up path augmentation is added as well. In addition, a few layers of the ResNet-50 backbone are removed to shorten the information path further, balancing detection speed and accuracy. To produce the detection results, the four vertex coordinates of each text box are regressed with the help of CIoU loss and shrunken text labels. The model processes an image in as little as 112 ms and achieves a higher comprehensive indicator value than the other compared models on the ICDAR 2013, ICDAR 2015, and MSRA-TD500 datasets.
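The abstract names CIoU loss but does not spell it out. As a rough illustration only, the commonly cited definition (one minus IoU, plus a normalized center-distance penalty, plus an aspect-ratio consistency term) can be sketched as below. This is not the authors' implementation: it assumes axis-aligned boxes given as (x1, y1, x2, y2), whereas the paper regresses four vertex coordinates, and the function name `ciou_loss` and the `eps` parameter are hypothetical.

```python
import math


def ciou_loss(box_p, box_g, eps=1e-9):
    """CIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Assumption: this follows the commonly cited CIoU definition,
    not necessarily the exact variant used in the paper.
    """
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers
    rho2 = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4 \
         + ((py1 + py2) - (gy1 + gy2)) ** 2 / 4

    # Squared diagonal of the smallest box enclosing both
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

In this sketch, identical boxes give a loss close to zero, while disjoint boxes give a loss above one because the center-distance penalty still applies when the IoU is zero.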
Pages: 15
Related papers
50 in total
  • [41] MTMFNet: multi-threshold and multi-scale feature fusion network for text detection
    Dai, Lei
    Gao, Wen
    Tang, Chengyu
    Wang, Min
    Chen, Zhihua
    VISUAL COMPUTER, 2025
  • [42] MS-DETR: a real-time multi-scale detection transformer for PCB defect detection
    Ji, Li
    Huang, Chaohang
    Li, Haiwei
    Han, Wenjie
    Yi, Leiye
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (03)
  • [43] A Real-Time Text Analysis System
    Chi Mai Nguyen
    Phat Trien Thai
    Duy Khang Lam
    Van Tuan Nguyen
    2023 IEEE 47TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC, 2023 : 340 - 345
  • [44] REAL-TIME VOICE TO TEXT WITH STENOGRAPHERS
    OAKEY, JE
    JOURNAL OF MICROCOMPUTER APPLICATIONS, 1993, 16 (03) : 271 - 276
  • [45] MULTI-SCALE VIDEO TEXT DETECTION BASED ON CORNER AND STROKE WIDTH VERIFICATION
    Zhang, Boyu
    Liu, JiaFeng
    Tang, XiangLong
    2013 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (IEEE VCIP 2013), 2013
  • [46] A Novel Multi-scale Deep Neural Framework for Script Invariant Text Detection
    Tauseef Khan
    Ayatullah Faruk Mollah
    Neural Processing Letters, 2022, 54 : 1371 - 1397
  • [47] Scene Text Detection Based on Multi-Scale Pooling and Bidirectional Feature Fusion
    Wei, Zheliang
    Li, Yueyang
    Luo, Haichi
    Computer Engineering and Applications, 2024, 60 (02) : 154 - 161
  • [48] Arbitrary shape text detection fusing InceptionNeXt and multi-scale attention mechanism
    Li, Xianguo
    Zhang, Yu
    Liu, Yi
    Yao, Xingchen
    Zhou, Xinyi
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (17) : 25484 - 25509
  • [49] A Novel Multi-scale Deep Neural Framework for Script Invariant Text Detection
    Khan, Tauseef
    Mollah, Ayatullah Faruk
    NEURAL PROCESSING LETTERS, 2022, 54 (02) : 1371 - 1397
  • [50] Multi-Scale Feature Aggregation for Rumor Detection: Unveiling the Truth within Text
    Wu, Jianming
    Chen, ShuHong
    Wang, Guojun
    Wang, Hao
    Li, Hanjun
    2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024 : 1086 - 1093