A deep learning model for Ottoman OCR

Cited by: 5
Authors
Dolek, Ishak [1 ]
Kurt, Atakan [1 ]
Affiliations
[1] Istanbul Univ Cerrahpasa, Engn Sch, Comp Engn Dept, Istanbul, Turkey
Source
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2022, Vol. 34, No. 20
Keywords
CNN; CTC; deep neural networks; LSTM; OCR; Ottoman; printed naksh font; RNN; NEURAL-NETWORK; RECOGNITION; SEGMENTATION; RETRIEVAL;
DOI
10.1002/cpe.6937
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline codes
081202; 0835;
Abstract
Ottoman OCR remains an open problem because OCR models for Arabic do not perform well on Ottoman, and models trained specifically on Ottoman documents have not produced satisfactory results either. We present a deep learning model, and an OCR tool built on it, for the OCR of printed Ottoman documents in the naksh font. We propose an end-to-end trainable CRNN architecture consisting of CNN, RNN (LSTM), and CTC layers for the Ottoman OCR problem. An experimental comparison of this model, which we call the Hybrid model, with Tesseract Arabic, Tesseract Persian, ABBYY FineReader, Miletos, and Google Docs OCR was performed on a test data set of 21 pages of original documents. With character recognition accuracies of 88.86% on raw text, 96.12% on normalized text, and 97.37% on joined text, the Hybrid model outperforms the others by a marked margin. It beats the next best model by a clear margin of 4%, a significant improvement given the difficulty of the Ottoman OCR problem and the huge size of the Ottoman archives to be processed. The Hybrid model also achieves 58% word recognition accuracy on normalized text, the only rate above 50%.
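The role of the CTC layer in a CRNN pipeline such as the one described above, turning per-timestep class predictions into a character string by collapsing repeats and removing blanks, can be sketched as follows. This is a minimal pure-Python illustration: the blank index, the toy alphabet, and the prediction sequence are assumptions for demonstration, not details from the paper.

```python
BLANK = 0  # index reserved for the CTC blank symbol (assumed convention)

def ctc_greedy_decode(timestep_argmax, charset):
    """Greedy CTC decoding: take the best class index per timestep,
    collapse consecutive repeats, then drop blank symbols."""
    out = []
    prev = None
    for idx in timestep_argmax:
        if idx != prev and idx != BLANK:
            out.append(charset[idx])
        prev = idx
    return "".join(out)

# Toy alphabet: index 0 is the blank, then three Arabic-script letters
# (hypothetical; the paper's actual character set is larger).
charset = {1: "ا", 2: "ب", 3: "ت"}

# Per-timestep argmax from a hypothetical CNN+LSTM output over 9 timesteps.
preds = [1, 1, 0, 2, 2, 2, 0, 0, 3]
print(ctc_greedy_decode(preds, charset))  # → "ابت"
```

Because blanks separate repeated labels, a sequence like `[0, 1, 0, 1]` decodes to two distinct characters rather than one; this is what lets a CTC-trained network emit doubled letters without explicit segmentation.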
Pages: 17
Related Papers (50 total)
  • [21] OCR with the Deep CNN Model for Ligature Script-Based Languages like Manchu
    Zhang, Diandian
    Liu, Yan
    Wang, Zhuowei
    Wang, Depei
    SCIENTIFIC PROGRAMMING, 2021, 2021
  • [22] Deep Learning-Aided OCR Techniques for Chinese Uppercase Characters in the Application of Internet of Things
    Yin, Yue
    Zhang, Wei
    Hong, Sheng
    Yang, Jie
    Xiong, Jian
    Gui, Guan
    IEEE ACCESS, 2019, 7 : 47043 - 47049
  • [23] Combining Deep Learning and Language Modeling for Segmentation-Free OCR From Raw Pixels
    Rawls, Stephen
    Cao, Huaigu
    Sabir, Ekraam
    Natarajan, Prem
    2017 1ST INTERNATIONAL WORKSHOP ON ARABIC SCRIPT ANALYSIS AND RECOGNITION (ASAR), 2017, : 119 - 123
  • [24] Baidu Meizu Deep Learning Competition: Arithmetic Operation Recognition Using End-to-End Learning OCR Technologies
    Jiang, Yuxiang
    Dong, Haiwei
    El Saddik, Abdulmotaleb
    IEEE ACCESS, 2018, 6 : 60128 - 60136
  • [25] Evaluating OCR and non-OCR text representations for learning document classifiers
    Junker, M
    Hoch, R
    PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, 1997, : 1060 - 1066
  • [26] OCR-Diff: A Two-Stage Deep Learning Framework for Optical Character Recognition Using Diffusion Model in Industrial Internet of Things
    Park, Chae-Won
    Palakonda, Vikas
    Yun, Sangseok
    Kim, Il-Min
    Kang, Jae-Mo
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (15): : 25997 - 26000
  • [27] Progressive Transmission and Inference of Deep Learning Models
    Lee, Youngsoo
    Yun, Sangdoo
    Kim, Yeonghun
    Choi, Sunghee
    20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021, : 271 - 277
  • [28] Greedy Search for Active Learning of OCR
    Agarwal, Arpit
    Garg, Ritu
    Chaudhury, Santanu
    2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 837 - 841
  • [29] Learning features for predicting OCR accuracy
    Ye, Peng
    Doermann, David
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3204 - 3207
  • [30] Statistical learning for OCR error correction
    Mei, Jie
    Islam, Aminul
    Moh'd, Abidalrahman
    Wu, Yajing
    Milios, Evangelos
    INFORMATION PROCESSING & MANAGEMENT, 2018, 54 (06) : 874 - 887