A real-time and effective text detection method for multi-scale and fuzzy text

Cited by: 2

Authors:

Tong, Guoxiang [1]

Dong, Ming [1]

Song, Yan [1]

Affiliations:

[1] Univ Shanghai Sci & Technol, Dept Opt Elect & Comp Engn, Shanghai 200093, Peoples R China

Source:

JOURNAL OF REAL-TIME IMAGE PROCESSING | 2023 / Vol. 20 / Issue 01

Keywords:

Natural scene text detection; Attention mechanism; Feature path augmentation; CIoU loss; SCENE; ACCURATE;

DOI:

10.1007/s11554-023-01267-x

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text in natural scenes can take various forms, and dynamic blur and geometric perspective greatly affect the efficiency of text detection. To address this, a real-time and effective method is proposed for detecting multi-scale and fuzzy text. The method applies a convolutional attention mechanism to the feature extraction backbone to obtain more valuable text feature maps. To make full use of the precise text location signals in low-level features, a bottom-up path augmentation is used at the same time. In addition, a few layers of the ResNet-50 backbone are removed to further shorten the information path and balance detection speed and accuracy. To produce the detection results, the four vertex coordinates of each text box are regressed with the help of CIoU loss and shrunken text labels. The model processes an image in as little as 112 ms and achieves a higher comprehensive indicator value than the compared models on the ICDAR 2013, ICDAR 2015, and MSRA-TD500 datasets.
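
The abstract names CIoU loss without giving its form. As a rough illustration only, the sketch below computes the commonly used Complete IoU loss (1 minus IoU, plus a normalized center-distance penalty and an aspect-ratio penalty) for two axis-aligned boxes. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for this example; the paper regresses four vertex coordinates, and its exact formulation is not stated in the abstract.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Return 1 - CIoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Squared distance between box centers.
    rho2 = (((ax1 + ax2) - (bx1 + bx2)) ** 2 + ((ay1 + ay2) - (by1 + by2)) ** 2) / 4

    # Squared diagonal of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)
```

Unlike a plain IoU loss, this value keeps changing with center distance when the boxes do not overlap, which is the usual motivation for using it in box regression.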


Pages: 15

50 items in total

[21] Real-Time Scene Text Detection with Differentiable Binarization
Liao, Minghui
Wan, Zhaoyi
Yao, Cong
Chen, Kai
Bai, Xiang
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11474 - 11481
[22] Performance Evaluation of Real-time and Scale-invariant LoG Operators for Text Detection
Dinh Cong Nguyen
Delalandre, Mathieu
Conte, Donatello
The Anh Pham
PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019, : 344 - 353
[23] Real-Time Text Detection with Multi-level Feature Fusion and Pixel Clustering
Xu, Lu
Jiang, Zhufeng
Han, Xingyu
Wang, Hui
Fan, Zizhu
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037 : 16 - 29
[24] Natural scene text detection by multi-scale adaptive color clustering and non-text filtering
Wu, Hui
Zou, Beiji
Zhao, Yu-Qian
Chen, Zailiang
Zhu, Chengzhang
Guo, Jianjing
NEUROCOMPUTING, 2016, 214 : 1011 - 1025
[25] Text Detection Algorithm Based on Multi-Scale Attention Feature Fusion
She, Xiangyang
Liu, Zhe
Dong, Lihong
Computer Engineering and Applications, 2024, 60 (01) : 198 - 206
[26] Multi-scale Information Fusion Combined with Residual Attention for Text Detection
Zhao, Wenxiu
Dongye, Changlei
NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448 : 506 - 518
[27] SCENE TEXT DETECTION BASED ON MULTI-SCALE SWT AND EDGE FILTERING
Feng, Yuanyuan
Song, Yonghong
Zhang, Yualin
2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 645 - 650
[28] Multi-Scale Scene Text Detection Based on Convolutional Neural Network
Lu, Yan-Feng
Zhang, Ai-Xuan
Li, Yi
Yu, Qian-Hui
Qiao, Hong
2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 583 - 587
[29] Real-time scale selection in hybrid multi-scale representations
Lindeberg, T
Bretzner, L
SCALE SPACE METHODS IN COMPUTER VISION, PROCEEDINGS, 2003, 2695 : 148 - 163
[30] Subsampling-based HOG for Multi-scale real-time Pedestrian Detection
Song, Peng-Lei
Zhu, Yan
Zhang, Zhen
Zhang, Jian-Dong
PROCEEDINGS OF THE IEEE 2019 9TH INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS (CIS) ROBOTICS, AUTOMATION AND MECHATRONICS (RAM) (CIS & RAM 2019), 2019, : 24 - 29
