Clustering of Farsi Sub-word Images for Whole-book Recognition

Cited by: 0

Authors:

Soheili, Mohammad Reza [1, 2]

Kabir, Ehsanollah [1]

Stricker, Didier [2]

Affiliations:

[1] Tarbiat Modares Univ, Dept Elect & Comp Engn, Tehran, Iran

[2] German Res Ctr Artificial Intelligence, Kaiserslautern, Germany

Source:

DOCUMENT RECOGNITION AND RETRIEVAL XXII | 2015 / Vol. 9402

Keywords:

document image analysis; sub-word image; incremental clustering; shape matching; large document; Persian;

DOI:

10.1117/12.2075931

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. We also show that the number of newly created clusters in a page can be used as a criterion for assessing the quality of print and evaluating preprocessing phases.
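
The abstract describes incremental clustering in which cheap shape features and an image-matching distance decide whether a sub-word image joins an existing cluster or opens a new one. The following is only a minimal sketch of that general idea, not the authors' algorithm. The feature set (height, width, ink count), the tolerance `tol`, the threshold `max_distance`, and the pixel-mismatch `pixel_distance` (a simple stand-in for the paper's image matching algorithm) are all assumptions made for illustration. Images are assumed to be non-empty binary grids with 1 marking ink.

```python
from dataclasses import dataclass, field


def shape_features(img):
    """Cheap features (hypothetical choice): height, width, ink pixel count."""
    return (len(img), len(img[0]), sum(sum(row) for row in img))


def features_close(a, b, tol=0.2):
    """True when each feature differs by at most tol relative to the larger value."""
    return all(abs(x - y) <= tol * max(x, y, 1) for x, y in zip(a, b))


def pixel_distance(a, b):
    """Fraction of mismatching pixels after zero-padding to a common size.

    Stand-in for a real image matching algorithm (an assumption).
    """
    h = max(len(a), len(b))
    w = max(len(a[0]), len(b[0]))

    def px(img, r, c):
        return img[r][c] if r < len(img) and c < len(img[0]) else 0

    diff = sum(px(a, r, c) != px(b, r, c) for r in range(h) for c in range(w))
    return diff / (h * w)


@dataclass
class Cluster:
    representative: list           # first image assigned to the cluster
    features: tuple                # shape features of the representative
    members: list = field(default_factory=list)  # indices of member images


def incremental_cluster(images, max_distance=0.1):
    """Assign each image to the nearest compatible cluster or open a new one.

    Returns the clusters and the indices of images that opened a new cluster.
    """
    clusters, opened = [], []
    for idx, img in enumerate(images):
        feats = shape_features(img)
        best, best_d = None, max_distance
        for cl in clusters:
            # The feature check skips the costlier pixel comparison.
            if not features_close(feats, cl.features):
                continue
            d = pixel_distance(img, cl.representative)
            if d <= best_d:
                best, best_d = cl, d
        if best is None:
            clusters.append(Cluster(img, feats, [idx]))
            opened.append(idx)
        else:
            best.members.append(idx)
    return clusters, opened
```

Under these assumptions, counting how many entries of `opened` fall on a given page gives a rough analogue of the per-page count of newly created clusters that the abstract proposes as a print-quality indicator.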

Pages: 12

