Fringe Map Based Text Line Segmentation of Printed Telugu Document Images

Cited by: 5
Authors
Koppula, Vijaya Kumar [1]
Negi, Atul [2]
Affiliations
[1] CMR Coll Engn & Technol, Dept CSE, Hyderabad 501401, Andhra Pradesh, India
[2] Univ Hyderabad, Dept CIS, Hyderabad 500046, Andhra Pradesh, India
Keywords
Text line segmentation; Indic scripts; Telugu OCR; Fringe Maps
DOI
10.1109/ICDAR.2011.260
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text line segmentation is a crucial step that can greatly influence the accuracy of an OCR system. One of the major obstacles to building high-accuracy OCR systems for Indic scripts has been the text line segmentation problem. For Telugu script in particular, this problem has yet to be adequately addressed by research. The common methods for Roman script are not applicable because of the inherent complexity of Telugu script. Previous approaches to Telugu OCR in the literature take a simplified view of the problem, leading to errors in line segmentation. The problem is compounded in old documents that were typeset manually and have non-uniform print quality. In this work we propose a new method based on the fringe map concept. In a fringe map, each pixel of the binary image is associated with a fringe number that denotes the distance to the nearest black pixel. We use fringe value information to segment text lines. First, we locate peak fringe numbers (PFNs). PFNs that do not lie between lines are filtered out. The PFNs between two adjacent lines are used to construct a region, and the segmenting path between those lines is found by joining the filtered PFNs of that region.
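The first two steps described in the abstract, building a fringe map and locating peak fringe numbers, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance metric or the exact definition of a peak, so the code assumes city-block (4-connected) distance, assigns black pixels a fringe number of 0, and treats a white pixel as a peak when its fringe number is not smaller than that of any 4-neighbour. The function names `fringe_map` and `peak_fringe_numbers` are hypothetical, and the filtering and path-joining steps are not attempted.

```python
from collections import deque

# Offsets of the four edge-adjacent neighbours (assumed city-block metric).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def fringe_map(image):
    """Return, for each pixel, the city-block distance to the nearest black pixel.

    `image` is a list of rows in which 1 marks a black (ink) pixel and 0 a
    white pixel. Black pixels get fringe number 0 (an assumption).
    """
    rows, cols = len(image), len(image[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed a multi-source breadth-first search with every black pixel.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def peak_fringe_numbers(fringe):
    """List (row, col, value) for white pixels that are local maxima.

    A peak here is a pixel whose fringe number is >= that of every
    4-neighbour; this definition is an assumption, not taken from the paper.
    """
    rows, cols = len(fringe), len(fringe[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = fringe[r][c]
            if not value:  # skip black pixels (0) and images with no ink (None)
                continue
            neighbours = [
                fringe[r + dr][c + dc]
                for dr, dc in NEIGHBOURS
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            if all(value >= n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Toy page: two horizontal "text lines" separated by three white rows.
page = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
fringe = fringe_map(page)
peaks = peak_fringe_numbers(fringe)
```

On this toy page the peaks all fall on the middle white row, midway between the two lines, which is where a separating path would be expected to run.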
Pages: 1294-1298
Number of pages: 5