Enhancing optical character recognition: Efficient techniques for document layout analysis and text line detection

Cited by: 9

Authors:

Fateh, Amirreza [1]

Fateh, Mansoor [2]

Abolghasemi, Vahid [3]

Affiliations:

[1] Iran Univ Sci & Technol IUST, Sch Comp Engn, Tehran, Iran

[2] Shahrood Univ Technol, Fac Comp Engn, Shahrud, Iran

[3] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, England

Source:

ENGINEERING REPORTS | 2024 / Vol. 6 / Issue 09

Keywords:

connected component; document layout analysis; font size; line detection; Persian printed;

DOI:

10.1002/eng2.12832

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In recent years, automatic document and text analysis has gained significant importance, driven by advancements in optical character recognition (OCR) technology and the need for efficient processing of large volumes of printed or handwritten documents. This article specifically focuses on document layout analysis (DLA) and text line detection (TLD), both of which are crucial components of OCR systems. Our objective is to develop an effective method for extracting both textual and non-textual regions, addressing challenges unique to the Persian (and Persian-like) language(s). In the DLA stage, we employ deep learning models and a voting system to accurately determine the regions of interest. Additionally, we introduce methods such as optimum font size concepts, angle correction, and a line curvature elimination algorithm in the TLD process to enhance OCR accuracy. Comparative evaluations against state-of-the-art methods demonstrate the superiority of our approach, showcasing a 2.8% improvement in the accuracy of Tesseract-OCR 5.1.0 (a well-established commercial OCR system) on the official Iranian newspapers dataset. These findings underscore the importance of addressing DLA and TLD challenges to advance OCR technology for Persian language documents and provide a solid foundation for future research in this domain. Our proposed method introduces several key novelties that contribute to the advancement of optical character recognition (OCR) systems. We collected and presented a valuable dataset for training and evaluating OCR models. Our proposed method successfully addresses challenges associated with document layout analysis (DLA) and text line detection in OCR systems, particularly for the Persian language. 
We significantly improve the accuracy of OCR systems by employing deep learning models in the DLA stage and implementing a voting system, as well as introducing angle correction methods, optimum font size concepts, and an efficient algorithm to eliminate line curvature.

Pages: 26
