Fringe Map Based Text Line Segmentation of Printed Telugu Document Images

Cited by: 5
Authors
Koppula, Vijaya Kumar [1]
Negi, Atul [2]
Affiliations
[1] CMR Coll Engn & Technol, Dept CSE, Hyderabad 501401, Andhra Pradesh, India
[2] Univ Hyderabad, Dept CIS, Hyderabad 500046, Andhra Pradesh, India
Keywords
Text line segmentation; Indic scripts; Telugu OCR; Fringe Maps
DOI
10.1109/ICDAR.2011.260
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text line segmentation is a crucial step that can greatly influence the accuracy of an OCR system. It has been one of the major obstacles to building high-accuracy OCR systems for Indic scripts, and for Telugu script in particular the problem has yet to be adequately addressed by research. The common methods for Roman script are not applicable because of the inherent script complexity of Telugu. Previous approaches to Telugu OCR in the literature take a simplified view of the problem, leading to errors in line segmentation. The problem is compounded in old documents that were typeset manually and have non-uniform print quality. In this work we propose a new method using the fringe map concept. In a fringe map, each pixel of the binary image is associated with a fringe number that denotes the distance to the nearest black pixel. We use fringe value information to segment text lines. First, we locate peak fringe numbers (PFNs). PFNs that do not lie between lines are filtered out. The PFNs between adjacent lines are used to construct a region. The segmenting path between the adjacent lines is then found by joining the filtered PFNs of that region.
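The fringe map idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which distance metric is used, so city-block (4-neighbour) distance is assumed here, and the function names `fringe_map` and `column_peaks` are hypothetical. The peak detector is a deliberately simple stand-in for the PFN location step (a strict local maximum along each column); the filtering and path-joining steps are not shown.

```python
from collections import deque


def fringe_map(image):
    """Return, for each pixel, the distance to the nearest black pixel.

    `image` is a list of rows with 1 = black (ink) and 0 = white.
    Assumption: city-block distance, computed by multi-source BFS.
    """
    rows, cols = len(image), len(image[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Every black pixel is a BFS source with fringe number 0.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1:
                dist[r][c] = 0
                queue.append((r, c))
    if not queue:
        raise ValueError("image contains no black pixels")
    # Expand outwards one step at a time over the 4-neighbourhood.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def column_peaks(fringe):
    """Return (row, col) positions that are strict vertical local maxima.

    Assumption: a simplified notion of "peak fringe number"; border rows
    are skipped because they lack one of the two vertical neighbours.
    """
    peaks = []
    for r in range(1, len(fringe) - 1):
        for c in range(len(fringe[0])):
            if fringe[r][c] > fringe[r - 1][c] and fringe[r][c] > fringe[r + 1][c]:
                peaks.append((r, c))
    return peaks


# Two solid "text lines" (rows 0 and 4) separated by three white rows.
page = [
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
fringe = fringe_map(page)
peaks = column_peaks(fringe)
```

In this toy page the largest fringe numbers fall on the middle white row, midway between the two ink rows, which is where a separating path between adjacent lines would be expected to run.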
Pages: 1294-1298
Page count: 5