Separation of Text from Non-Text Doodles of Poet Rabindranath Tagore's Manuscripts

Authors
Chaudhuri, B. B. [1]
Borah, Samarjeet [1]
Saraf, Ankita [1]
Goyal, Alisha [1]
Kumari, Alka [1]
Affiliation
[1] Indian Stat Inst, CVPR Unit, Kolkata 700108, India
Keywords
Text; Non text Doodles; Rabindranath Tagore; Connected Components; pixels; Stroke Width; EXTRACTION; SEGMENTATION;
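DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract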
The growing popularity of internet facilities has provided a convenient and faster way to mine a warehouse of both historical and contemporary handwritten documents, and this has led to continuous research and development in the field of information retrieval algorithms. In such handwritten documents, graphics and images are combined with text and often overlap one another. This paper presents a technique for separating textual data from non-textual information. The technique is based on some previously published works and is applied to poet Rabindranath Tagore's manuscripts. The approach generates connected components as basic primitives and tries to classify them as text or non-text based on a comparison between the total number of pixels and the number of boundary pixels constituting the component. A window is then generated, and further separation is done on the basis of the stroke width computed for each window. The paper also contains a brief review of some previously published works.
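The two quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no decision rule, threshold, or window size, so the code below only computes the boundary-to-total pixel ratio of each connected component and a simple stroke-width estimate for a window. The function names, the 8-connectivity choice, the 4-neighbour boundary definition, and the run-length estimate of stroke width are all assumptions made for illustration.

```python
from collections import deque


def connected_components(img):
    """Return the 8-connected foreground (1) components of a binary image."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps


def boundary_ratio(img, pixels):
    """Fraction of component pixels that touch background or the image edge.

    Assumption: a pixel counts as boundary if any of its 4 neighbours is
    background or lies outside the image.
    """
    h, w = len(img), len(img[0])
    boundary = 0
    for y, x in pixels:
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w) or img[ny][nx] == 0:
                boundary += 1
                break
    return boundary / len(pixels)


def stroke_width(window):
    """Most frequent horizontal foreground run length in a window (0 if empty).

    Assumption: run-length mode stands in for the paper's stroke-width measure.
    Ties are broken in favour of the shorter run.
    """
    counts = {}
    for row in window:
        run = 0
        for v in list(row) + [0]:  # trailing 0 closes a run at the row end
            if v == 1:
                run += 1
            elif run:
                counts[run] = counts.get(run, 0) + 1
                run = 0
    return max(counts, key=lambda k: (counts[k], -k)) if counts else 0


# Toy image: a one-pixel-wide vertical stroke and a filled 5x5 blob.
img = [[0] * 10 for _ in range(7)]
for r in range(1, 6):
    img[r][1] = 1
    for c in range(4, 9):
        img[r][c] = 1

for comp in connected_components(img):
    print(len(comp), boundary_ratio(img, comp))
```

On this toy image every pixel of the thin stroke lies on its boundary, while the filled blob has interior pixels and therefore a lower ratio. Which side of such a ratio corresponds to text in real manuscripts, and where to place the cut-off, would have to come from the paper itself.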
Pages: 165-169
Number of pages: 5