Enhancing optical character recognition: Efficient techniques for document layout analysis and text line detection

Cited by: 9
Authors
Fateh, Amirreza [1]
Fateh, Mansoor [2]
Abolghasemi, Vahid [3]
Affiliations
[1] Iran Univ Sci & Technol IUST, Sch Comp Engn, Tehran, Iran
[2] Shahrood Univ Technol, Fac Comp Engn, Shahrud, Iran
[3] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, England
Keywords
connected component; document layout analysis; font size; line detection; Persian printed
DOI
10.1002/eng2.12832
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In recent years, automatic document and text analysis has gained significant importance, driven by advances in optical character recognition (OCR) technology and the need to process large volumes of printed or handwritten documents efficiently. This article focuses on document layout analysis (DLA) and text line detection (TLD), both of which are crucial components of OCR systems. Our objective is to develop an effective method for extracting both textual and non-textual regions, addressing challenges unique to Persian and Persian-like languages. In the DLA stage, we employ deep learning models and a voting system to accurately determine the regions of interest. In the TLD stage, we introduce optimum font size concepts, angle correction, and a line curvature elimination algorithm to enhance OCR accuracy. Comparative evaluations against state-of-the-art methods demonstrate the superiority of our approach, showing a 2.8% improvement in the accuracy of Tesseract-OCR 5.1.0 (a well-established open-source OCR engine) on the official Iranian newspapers dataset. These findings underscore the importance of addressing DLA and TLD challenges to advance OCR technology for Persian-language documents and provide a solid foundation for future research in this domain.

Our proposed method introduces several key novelties that contribute to the advancement of OCR systems. We collected and presented a valuable dataset for training and evaluating OCR models. Our proposed method successfully addresses challenges associated with DLA and TLD in OCR systems, particularly for the Persian language. We significantly improve the accuracy of OCR systems by employing deep learning models and a voting system in the DLA stage, and by introducing angle correction methods, optimum font size concepts, and an efficient algorithm to eliminate line curvature.
Pages: 26
Related papers
50 in total
  • [1] Joint Layout Analysis, Character Detection and Recognition for Historical Document Digitization
    Ma, Weihong
    Zhang, Hesuo
    Jin, Lianwen
    Wu, Sihang
    Wang, Jiapeng
    Wang, Yongpan
    2020 17TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2020), 2020, : 31 - 36
  • [2] Optical Character Recognition for Scene Text Detection, Mining and Recognition
    Nathiya, N.
    Pradeepa, K.
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2013, : 662 - 665
  • [3] Video text detection and segmentation for optical character recognition
    Ngo, CW
    Chan, CK
    MULTIMEDIA SYSTEMS, 2005, 10 (03) : 261 - 272
  • [4] Video text detection and segmentation for optical character recognition
    Chong-Wah Ngo
    Chi-Kwong Chan
    Multimedia Systems, 2005, 10 : 261 - 272
  • [5] Enhancing Optical Character Recognition on Images with Mixed Text Using Semantic Segmentation
    Patil, Shruti
    Varadarajan, Vijayakumar
    Mahadevkar, Supriya
    Athawade, Rohan
    Maheshwari, Lakhan
    Kumbhare, Shrushti
    Garg, Yash
    Dharrao, Deepak
    Kamat, Pooja
    Kotecha, Ketan
    JOURNAL OF SENSOR AND ACTUATOR NETWORKS, 2022, 11 (04)
  • [6] From object detection to text detection and recognition: A brief evolution history of optical character recognition
    Wang, Haifeng
    Pan, Changzai
    Guo, Xiao
    Ji, Chunlin
    Deng, Ke
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2021, 13 (05)
  • [7] Single-Line Text Detection in Multi-Line Text with Narrow Spacing for Line-Based Character Recognition
    Leow, Chee Siang
    Yajima, Hideaki
    Kitagawa, Tomoki
    Nishizaki, Hiromitsu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (12) : 2097 - 2106
  • [8] Optical Character Recognition For Handwritten Forms With Dynamic Layout
    Arora, Anushri
    Chandratre, Aniruddh
    PROCEEDINGS OF THE 2018 4TH INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT - 2018), 2018, : 299 - 303
  • [9] Hearthstone Helper - Using Optical Character Recognition Techniques for Cards Detection
    Chiru, Costin-Gabriel
    Oprea, Florin
    ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, AIMSA 2016, 2016, 9883 : 192 - 201
  • [10] Optical character recognition of Arabic printed text
    Electrical and Electronics Engineering Department, University of Khartoum, Sudan
    SCOReD - IEEE Stud. Conf. Res. Dev., (235-240):