TextMountain: Accurate scene text detection via instance segmentation

Cited by: 66

Authors:

Zhu, Yixing [1]

Du, Jun [1]

Affiliations:

[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Anhui, Peoples R China

Source:

PATTERN RECOGNITION | 2021 / Vol. 110

Funding:

National Natural Science Foundation of China; National Key Research and Development Program;

Keywords:

Scene text detection; Curved text; Multi-oriented text; CNN; Deep learning; RECOGNITION;

DOI:

10.1016/j.patcog.2020.107336

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a novel scene text detection method named TextMountain. The key idea of TextMountain is making full use of border-center information. Different from previous works that treat center-border as a binary classification problem, we predict text center-border probability (TCBP) and text center-direction (TCD). The TCBP is just like a mountain whose top is text center and foot is text border. The mountaintop can separate text instances which cannot be easily achieved using semantic segmentation map and its rising direction can plan a road to top for each pixel on mountain foot at the group stage. The TCD helps TCBP learning better. Our label rules will not lead to the ambiguous problem with the transformation of angle, so the proposed method is robust to multi-oriented text and can also handle well curved text. In inference stage, each pixel at the mountain foot needs to search the path to the mountaintop and this process can be efficiently completed in parallel, yielding the efficiency of our method compared with others. The experiments on MLT, ICDAR2015, RCTW-17 and SCUT-CTW1500 datasets demonstrate that the proposed method achieves better or comparable performance in terms of both accuracy and efficiency. It is worth mentioning our method achieves an F-measure of 76.85% on MLT which outperforms the previous methods by a large margin. Code will be made available. © 2020 Elsevier Ltd. All rights reserved.
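
The grouping idea in the abstract (every mountain-foot pixel climbs to a mountaintop, and pixels reaching the same top form one text instance) can be illustrated with a small sketch. This is not the authors' implementation: it assumes the TCBP map is already given as a 2D list of floats, it replaces the predicted TCD with a plain steepest-ascent step over the 8 neighbours, and it runs sequentially rather than in parallel. The function names and both thresholds are hypothetical.

```python
from collections import deque

# Offsets of the 8 neighbours of a pixel (row, column).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def label_peaks(tcbp, peak_thresh):
    """Give each connected region above peak_thresh (a 'mountaintop') its own label."""
    h, w = len(tcbp), len(tcbp[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if tcbp[y][x] > peak_thresh and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one mountaintop
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and labels[ny][nx] == 0
                                and tcbp[ny][nx] > peak_thresh):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def group_pixels(tcbp, text_thresh=0.1, peak_thresh=0.6):
    """Assign every text pixel to the mountaintop reached by climbing uphill.

    Assumption: hypothetical thresholds; steepest ascent stands in for the
    learned text center-direction (TCD) described in the abstract.
    """
    h, w = len(tcbp), len(tcbp[0])
    labels, count = label_peaks(tcbp, peak_thresh)

    # For each unlabeled text pixel, point to its strictly higher best neighbour.
    uphill = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x] == 0 and tcbp[y][x] > text_thresh:
                best = (y, x)
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and tcbp[ny][nx] > tcbp[best[0]][best[1]]):
                        best = (ny, nx)
                uphill[(y, x)] = best

    # Follow the pointers until a labeled pixel or a local maximum is reached.
    for start in uphill:
        y, x = start
        while labels[y][x] == 0 and uphill[(y, x)] != (y, x):
            y, x = uphill[(y, x)]
        labels[start[0]][start[1]] = labels[y][x]  # stays 0 if no top was reached
    return labels, count
```

Calling `group_pixels` on a map with two separated peaks returns a label map in which the low-probability valley between them is split between the two instances, which is the separation a binary segmentation map alone would not give. Because each pixel's uphill pointer depends only on its own neighbourhood, the pointer-following step is the part that lends itself to parallel execution.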


Pages: 11
