Detecting Text in the Wild with Deep Character Embedding Network

Cited by: 3
Authors
Li, Jiaming [1 ]
Zhang, Chengquan [1 ]
Sun, Yipeng [1 ]
Han, Junyu [1 ]
Ding, Errui [1 ]
Affiliation
[1] Baidu Inc., Beijing, People's Republic of China
Keywords
Text detection; Character detection; Embedding learning
DOI
10.1007/978-3-030-20870-7_31
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Most text detection methods assume that text is horizontal or multi-oriented and therefore define quadrangles as the basic detection unit. However, text in the wild is often perspectively distorted or curved, which existing approaches cannot easily handle. In this paper, we propose a deep character embedding network (CENet) that simultaneously predicts the bounding boxes of characters and their embedding vectors, turning text detection into a simple clustering task in the character embedding space. The proposed method does not rely on the strong assumption that characters form a straight line, which gives it flexibility on arbitrarily curved or perspectively distorted text. For the character detection task, a dense prediction subnetwork is designed to obtain the confidence scores and bounding boxes of characters. For the character embedding task, a subnetwork is trained with a contrastive loss to project detected characters into the embedding space. The two tasks share a backbone CNN from which multi-scale feature maps are extracted. The final text regions are obtained by thresholding the character confidence and the embedding distance between character pairs. We evaluated our method on ICDAR13, ICDAR15, MSRA-TD500, and Total-Text. The proposed method achieves state-of-the-art or comparable performance on all of these datasets and shows a substantial improvement on the irregular-text dataset, Total-Text.
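The grouping step described in the abstract (thresholding character confidence, then thresholding the embedding distance between character pairs) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `group_characters`, the threshold values, the use of Euclidean distance, and the choice to form text instances as connected components over linked pairs are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def group_characters(embeddings, scores, score_thresh=0.5, dist_thresh=1.0):
    """Group detected characters into text instances (hypothetical helper).

    Assumed procedure: keep characters whose confidence is at least
    score_thresh, link every kept pair whose Euclidean embedding distance
    is below dist_thresh, and return the connected components as lists of
    character indices.
    """
    keep = [i for i, s in enumerate(scores) if s >= score_thresh]
    parent = {i: i for i in keep}  # union-find forest over kept characters

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(keep, 2):
        if dist(embeddings[i], embeddings[j]) < dist_thresh:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in keep:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy input: two tight clusters in a 2-D embedding space, plus one
# low-confidence detection (index 4) that is discarded before grouping.
embeddings = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0)]
scores = [0.9, 0.8, 0.95, 0.7, 0.2]
groups = group_characters(embeddings, scores)  # [[0, 1], [2, 3]]
```

Each returned group lists the indices of characters assigned to one text instance; because grouping depends only on pairwise distances, nothing in this sketch requires the characters to lie on a straight line.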
Pages: 501-517
Page count: 17