Multi-script text versus non-text classification of regions in scene images

Citations: 11
Authors
Sriman, Bowornrat [1]
Schomaker, Lambert [1]
Affiliations
[1] Univ Groningen, Artificial Intelligence, Nijenborgh 9, NL-9747 AG Groningen, Netherlands
Keywords
Text detection in scene images; Text/non-text classification; Color features; Color histogram autocorrelation; SCALE; RECOGNITION;
DOI
10.1016/j.jvcir.2019.04.007
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Text versus non-text region classification is an essential but difficult step in scene-image analysis due to the considerable shape complexity of text and background patterns. There is a high probability of confusion between background elements and letter parts. This paper proposes a feature-based classification of image blocks using the color autocorrelation histogram (CAH) and the scale-invariant feature transform (SIFT) algorithm, yielding a combined scale- and color-invariant feature suitable for scene-text classification. For the evaluation, features were extracted from different color spaces by applying color-histogram autocorrelation. The color features are concatenated with a SIFT descriptor. Parameter tuning is performed and evaluated. For the classification, a standard nearest-neighbor classifier (1NN) and a support vector machine (SVM) were compared. The proposed method appears to perform robustly and is especially suitable for Asian scripts such as Kannada and Thai, where urban scene-text fonts are characterized by high curvature and salient color variations. © 2019 Published by Elsevier Inc.
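The abstract does not give the exact definition of the color features, so the following is only a minimal sketch of the general idea: a histogram of one color channel is computed for an image block, the histogram is autocorrelated over bin shifts, and the resulting vector is classified with a 1NN rule. The function names, the use of hue values in [0, 1], the circular shift, and the bin count are all assumptions made for illustration; the SIFT descriptor that the paper combines with the color feature is omitted here.

```python
import math

def histogram(values, bins, lo=0.0, hi=1.0):
    """Normalized histogram of channel values expected to lie in [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[max(0, min(idx, bins - 1))] += 1  # clamp to a valid bin
    total = sum(counts) or 1
    return [c / total for c in counts]

def circular_autocorrelation(hist):
    """Autocorrelation of a histogram over all circular bin shifts (assumed form)."""
    n = len(hist)
    return [sum(hist[i] * hist[(i + lag) % n] for i in range(n))
            for lag in range(n)]

def block_feature(hues, bins=8):
    """Hypothetical block descriptor: autocorrelated hue histogram."""
    return circular_autocorrelation(histogram(hues, bins))

def nearest_neighbor(query, train):
    """1NN: return the label of the closest training feature (Euclidean)."""
    return min(train, key=lambda item: math.dist(query, item[0]))[1]

# Toy data: a "text" block with two dominant hues (ink and background)
# and a "non-text" block with hues spread over the whole range.
train = [
    (block_feature([0.1] * 50 + [0.6] * 50), "text"),
    (block_feature([i / 100 for i in range(100)]), "non-text"),
]
query = block_feature([0.12] * 40 + [0.58] * 60)
print(nearest_neighbor(query, train))
```

In this toy setup the two-hue block produces peaks at the zero shift and at the shift separating the two hues, while the spread-out block gives a nearly flat vector, which is the kind of contrast such a feature is meant to capture.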
Pages: 23-42
Page count: 20