Fringe Map Based Text Line Segmentation of Printed Telugu Document Images

Cited by: 5
Authors
Koppula, Vijaya Kumar [1]
Negi, Atul [2]
Affiliations
[1] CMR Coll Engn & Technol, Dept CSE, Hyderabad 501401, Andhra Pradesh, India
[2] Univ Hyderabad, Dept CIS, Hyderabad 500046, Andhra Pradesh, India
Keywords
Text line segmentation; Indic scripts; Telugu OCR; Fringe maps
DOI
10.1109/ICDAR.2011.260
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text line segmentation is a crucial step that strongly influences the accuracy of an OCR system, and it has been one of the major obstacles to building high-accuracy OCR systems for Indic scripts. For Telugu script in particular, the problem has not yet been adequately addressed. Methods commonly used for Roman script are not applicable because of the inherent complexity of Telugu script. Previous approaches to Telugu OCR take a simplified view of the problem, which leads to line segmentation errors. The problem is compounded in old documents that were typeset manually and have non-uniform print quality. In this work we propose a new method based on the fringe map concept. In a fringe map, each pixel of the binary image is assigned a fringe number that denotes its distance to the nearest black pixel. We use the fringe values to segment text lines. First, we locate peak fringe numbers (PFNs). PFNs that do not lie between lines are filtered out. The PFNs between two adjacent lines are used to construct a region, and the segmenting path between those lines is found by joining the filtered PFNs of that region.
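The fringe map idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the distance metric or how a peak is defined, so the sketch assumes city-block (4-neighbour) distance and treats a peak as a local maximum along a column. The names fringe_map and column_peaks are hypothetical, and the filtering and path-joining steps are omitted.

```python
from collections import deque


def fringe_map(image):
    """Return the fringe number of every pixel.

    image is a list of rows; 1 marks a black (ink) pixel, 0 a white one.
    Assumption: distance is city-block distance, computed by a
    multi-source breadth-first search started from all black pixels.
    Assumes the image contains at least one black pixel.
    """
    rows, cols = len(image), len(image[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1:
                dist[r][c] = 0  # black pixels have fringe number 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def column_peaks(fringe):
    """Return (row, col) positions that are vertical local maxima.

    Assumption: a peak is larger than the pixel above it and at least
    as large as the pixel below it; border rows are skipped.
    """
    peaks = []
    for r in range(1, len(fringe) - 1):
        for c in range(len(fringe[0])):
            if fringe[r][c] > fringe[r - 1][c] and fringe[r][c] >= fringe[r + 1][c]:
                peaks.append((r, c))
    return peaks


# Toy image: two solid "text lines" (rows 0 and 4) with a white gap between.
image = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
fringe = fringe_map(image)
peaks = column_peaks(fringe)
```

In this toy image the peaks fall on the middle row of the gap, which is where a segmenting path between the two lines would run. Real Telugu text has ascenders and descenders that intrude into the gap, which is why the method described above filters the PFNs before joining them.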
Pages: 1294-1298
Page count: 5