Clustering of Farsi Sub-word Images for Whole-book Recognition

Authors
Soheili, Mohammad Reza [1 ,2 ]
Kabir, Ehsanollah [1 ]
Stricker, Didier [2 ]
Affiliations
[1] Tarbiat Modares Univ, Dept Elect & Comp Engn, Tehran, Iran
[2] German Res Ctr Artificial Intelligence, Kaiserslautern, Germany
Keywords
document image analysis; sub-word image; incremental clustering; shape matching; large document; Persian;
DOI
10.1117/12.2075931
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use the important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm to measure the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. We also show that the number of newly created clusters in a page can be used as a criterion for assessing the quality of print and evaluating preprocessing phases.
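The abstract gives no implementation details, so the following is only a minimal sketch of how an incremental clustering loop of this general kind might look. Everything in it is an assumption made for illustration: the feature set (height, width, ink count), the pixel-difference distance standing in for the image matching algorithm, the thresholds `tol` and `max_dist`, and all function and class names. It is not the authors' method. Images are assumed to be non-empty binary grids.

```python
from dataclasses import dataclass, field

Image = list[list[int]]  # binary image: 1 = ink, 0 = background (assumed non-empty)


def shape_features(img: Image) -> tuple:
    """Hypothetical simple shape features: (height, width, ink pixel count)."""
    return (len(img), len(img[0]), sum(sum(row) for row in img))


def features_compatible(f1: tuple, f2: tuple, tol: float = 0.2) -> bool:
    """Cheap pre-filter: every feature must agree within a relative tolerance."""
    return all(abs(a - b) <= tol * max(a, b, 1) for a, b in zip(f1, f2))


def match_distance(a: Image, b: Image) -> float:
    """Stand-in for image matching: fraction of differing pixels after
    zero-padding both images to a common size (top-left aligned)."""
    h = max(len(a), len(b))
    w = max(len(a[0]), len(b[0]))

    def px(img: Image, r: int, c: int) -> int:
        return img[r][c] if r < len(img) and c < len(img[0]) else 0

    diff = sum(px(a, r, c) != px(b, r, c) for r in range(h) for c in range(w))
    return diff / (h * w)


@dataclass
class Cluster:
    representative: Image
    features: tuple
    members: list = field(default_factory=list)


def cluster_page(images: list, clusters: list, max_dist: float = 0.15) -> int:
    """Assign each image to the nearest compatible cluster, or open a new one.
    Returns the number of clusters newly created while processing this page."""
    created = 0
    for img in images:
        feats = shape_features(img)
        best, best_d = None, max_dist
        for cl in clusters:
            # Skip the costlier matching step when the shape features disagree.
            if not features_compatible(feats, cl.features):
                continue
            d = match_distance(img, cl.representative)
            if d <= best_d:
                best, best_d = cl, d
        if best is None:
            best = Cluster(img, feats)
            clusters.append(best)
            created += 1
        best.members.append(img)
    return created
```

Calling `cluster_page` once per page with the same `clusters` list yields a per-page count of newly created clusters. In this sketch the shape features only gate the distance computation; how the paper actually combines features and distance is not stated in the abstract.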
Pages: 12