A Historical Handwritten Dataset for Ethiopic OCR with Baseline Models and Human-Level Performance


Authors:

Belay, Birhanu Hailu [1]

Guyon, Isabelle [1,2,3]

Mengiste, Tadele [4]

Tilahun, Bezawork [4]

Liwicki, Marcus [5]

Tegegne, Tesfa [4]

Egele, Romain [1]

Affiliations:

[1] Univ Paris Saclay, LISN, Gif Sur Yvette, France

[2] Google Brain, Mountain View, CA USA

[3] ChaLearn, Berkeley, CA USA

[4] Bahir Dar Univ, Bahir Dar, Ethiopia

[5] Lulea Univ Technol, Lulea, Sweden

Source:

DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT III | 2024 / Vol. 14806

Keywords:

Historical Ethiopic script; Human-level recognition performance; HHD-Ethiopic; Normalized edit distance; Text recognition

DOI:

10.1007/978-3-031-70543-4_2

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces a new OCR dataset for historical handwritten Ethiopic script, characterized by a unique syllabic writing system, low-resource availability, and complex orthographic diacritics. The dataset consists of roughly 80,000 annotated text-line images from 1700 pages of 18th to 20th century documents, including a training set with text-line images from the 19th to 20th century and two test sets. One is distributed similarly to the training set, with nearly 6,000 text-line images, and the other contains only images from 18th century manuscripts, with around 16,000 images. The former test set allows us to check baseline performance in the classical IID (Independently and Identically Distributed) setting, while the latter addresses a more realistic setting in which the test set is drawn from a different distribution than the training set (Out-Of-Distribution, or OOD). Multiple annotators labeled all text-line images for the HHD-Ethiopic dataset, and an expert supervisor double-checked them. We assessed human-level recognition performance and compared it with state-of-the-art (SOTA) OCR models using the Character Error Rate (CER) and Normalized Edit Distance (NED) metrics. Our results show that the model performed comparably to human-level recognition on the 18th century test set and outperformed humans on the IID test set. However, the unique challenges posed by the Ethiopic script, such as detecting complex diacritics, still present difficulties for the models. Our baseline evaluation and dataset will encourage further research on Ethiopic script recognition. The dataset and source code can be accessed at https://github.com/bdu-birhanu/HHD-Ethiopic.
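The abstract names CER and NED without defining them. The sketch below is one common way to compute both from a Levenshtein edit distance; it is an illustration, not the authors' evaluation code. It assumes CER divides the edit distance by the reference length and NED divides it by the longer of the two strings; the paper may use a different normalization (some works report 1 minus this value). The function names `edit_distance`, `cer`, and `ned` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(prediction: str, reference: str) -> float:
    """Character Error Rate (assumed definition): distance / reference length."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


def ned(prediction: str, reference: str) -> float:
    """Normalized Edit Distance (assumed definition): distance / longer length."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0
    return edit_distance(prediction, reference) / longest


if __name__ == "__main__":
    # Each Ethiopic syllable below is a single Unicode code point,
    # so one wrong vowel order counts as one substitution.
    print(cer("ሰለም", "ሰላም"))
    print(ned("ሰለም", "ሰላም"))
```

Under these definitions CER can exceed 1 when the prediction is much longer than the reference, whereas NED always stays in the range 0 to 1, which is one reason to report both.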

Citation

Pages: 23-38

Page count: 16

