Enhancing optical character recognition: Efficient techniques for document layout analysis and text line detection

Cited by: 9
Authors
Fateh, Amirreza [1]
Fateh, Mansoor [2]
Abolghasemi, Vahid [3]
Affiliations
[1] Iran Univ Sci & Technol IUST, Sch Comp Engn, Tehran, Iran
[2] Shahrood Univ Technol, Fac Comp Engn, Shahrud, Iran
[3] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, England
Keywords
connected component; document layout analysis; font size; line detection; Persian printed
DOI
10.1002/eng2.12832
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In recent years, automatic document and text analysis has gained significant importance, driven by advancements in optical character recognition (OCR) technology and the need for efficient processing of large volumes of printed or handwritten documents. This article specifically focuses on document layout analysis (DLA) and text line detection (TLD), both of which are crucial components of OCR systems. Our objective is to develop an effective method for extracting both textual and non-textual regions, addressing challenges unique to the Persian (and Persian-like) language(s). In the DLA stage, we employ deep learning models and a voting system to accurately determine the regions of interest. Additionally, we introduce methods such as optimum font size concepts, angle correction, and a line curvature elimination algorithm in the TLD process to enhance OCR accuracy. Comparative evaluations against state-of-the-art methods demonstrate the superiority of our approach, showcasing a 2.8% improvement in the accuracy of Tesseract-OCR 5.1.0 (a well-established commercial OCR system) on the official Iranian newspapers dataset. These findings underscore the importance of addressing DLA and TLD challenges to advance OCR technology for Persian language documents and provide a solid foundation for future research in this domain. Our proposed method introduces several key novelties that contribute to the advancement of optical character recognition (OCR) systems. We collected and presented a valuable dataset for training and evaluating OCR models. Our proposed method successfully addresses challenges associated with document layout analysis (DLA) and text line detection in OCR systems, particularly for the Persian language. 
We significantly improve the accuracy of OCR systems by employing deep learning models in the DLA stage and implementing a voting system, as well as introducing angle correction methods, optimum font size concepts, and an efficient algorithm to eliminate line curvature.
Pages: 26