Balanced Synthetic Data for Accurate Scene Text Spotting

Cited by: 0
Authors
Yao, Ying [1]
Huang, Zhangjin [2]
Affiliations
[1] Univ Sci & Technol China, Sch Software Engn, Hefei 230051, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
Keywords
synthesize and balance; text detection; text recognition; neural networks
DOI
10.1117/12.2503258
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
Previous approaches to scene text detection or recognition have achieved promising performance across various benchmarks, and many strong neural network models are available for training the desired classifiers. Beyond the design of loss functions and network architectures, the size and quality of the dataset are key to using neural networks effectively. In this paper we propose a new method for synthesizing text in natural scene images that takes data balance into account. For each image we obtain region normals from depth and region information. After choosing a text from the text source, we blend it into the original image using the homography matrix between the contours of the original region and the contours of the mask in which the text is placed directly. In particular, the text source is obtained with a specific loss function that reflects the distance between the current character distribution and the target character distribution. Text detection experiments on the standard ICDAR2015 dataset and on the augmented dataset show that our balanced synthetic dataset reaches an 84.5% F-score, 2% higher than the result on the standard dataset alone and also higher than the synthetic dataset without balancing. For text recognition, training on balanced synthetic datasets yields a large improvement over training on some public standard recognition datasets and also outperforms synthetic datasets without balancing.
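The abstract does not give the form of the balancing loss, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an L1 distance between distributions, a uniform target distribution, and a greedy choice among candidate words; the names `char_distribution`, `balance_loss`, and `pick_next_text` are hypothetical.

```python
from collections import Counter


def char_distribution(counts, alphabet):
    """Normalize raw character counts into a probability distribution."""
    total = sum(counts[c] for c in alphabet)
    if total == 0:
        return {c: 0.0 for c in alphabet}
    return {c: counts[c] / total for c in alphabet}


def balance_loss(counts, target, alphabet):
    """L1 distance between current and target distributions (assumed form)."""
    current = char_distribution(counts, alphabet)
    return sum(abs(current[c] - target[c]) for c in alphabet)


def pick_next_text(candidates, counts, target, alphabet):
    """Greedily return the candidate whose addition gives the lowest loss."""
    def loss_after(word):
        return balance_loss(counts + Counter(word), target, alphabet)
    return min(candidates, key=loss_after)


# Toy setup: a three-letter alphabet and a uniform target (both assumptions).
alphabet = "abc"
target = {c: 1 / len(alphabet) for c in alphabet}

# Characters already rendered into synthetic images so far.
counts = Counter("aaab")

# Pick the word that best pulls the running distribution toward the target.
choice = pick_next_text(["aab", "bcc", "abc"], counts, target, alphabet)
print(choice)
```

In this toy case the running counts are heavy on `a`, so the selection favors a word that supplies the under-represented characters. A different distance (for example KL divergence) or a non-uniform target could be substituted without changing the structure.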
Pages: 8
Related papers
50 in total
  • [31] Fast and Accurate Text Detection in Natural Scene Images
    Xiao, Chengqiu
    Ji, Lixin
    Gao, Chao
    Li, Shaomei
    INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: IMAGE AND VIDEO DATA ENGINEERING, ISCIDE 2015, PT I, 2015, 9242 : 1 - 10
  • [32] Aggregating Local Context for Accurate Scene Text Detection
    He, Dafang
    Yang, Xiao
    Huang, Wenyi
    Zhou, Zihan
    Kifer, Daniel
    Giles, C. Lee
    COMPUTER VISION - ACCV 2016, PT V, 2017, 10115 : 280 - 296
  • [33] Geometry Normalization Networks for Accurate Scene Text Detection
    Xu, Youjiang
    Duan, Jiaqi
    Kuang, Zhanghui
    Yue, Xiaoyu
    Sun, Hongbin
    Guan, Yue
    Zhang, Wayne
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 9136 - 9145
  • [34] Data Augmentation for Scene Text Recognition
    Atienza, Rowel
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 1561 - 1570
  • [35] DPGS: Cross-cooperation guided dynamic points generation for scene text spotting
    Sun, Wei
    Wang, Qianzhou
    Hou, Zhiqiang
    Chen, Xueling
    Yan, Qingsen
    Zhang, Yanning
    KNOWLEDGE-BASED SYSTEMS, 2024, 302
  • [36] ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting
    Fang, Shancheng
    Mao, Zhendong
    Xie, Hongtao
    Wang, Yuxin
    Yan, Chenggang
    Zhang, Yongdong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (06) : 7123 - 7141
  • [37] MGN-Net: Multigranularity Graph Fusion Network in Multimodal for Scene Text Spotting
    Yuan, Zhengyi
    Shi, Cao
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (14) : 25088 - 25098
  • [38] MOSTL: An Accurate Multi-Oriented Scene Text Localization
    Naiemi, Fatemeh
    Ghods, Vahid
    Khalesi, Hassan
    Circuits, Systems, and Signal Processing, 2021, 40 : 4452 - 4473
  • [39] TextMountain: Accurate scene text detection via instance segmentation
    Zhu, Yixing
    Du, Jun
    PATTERN RECOGNITION, 2021, 110
  • [40] MOSTL: An Accurate Multi-Oriented Scene Text Localization
    Naiemi, Fatemeh
    Ghods, Vahid
    Khalesi, Hassan
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2021, 40 (09) : 4452 - 4473