Soft set-based MSER end-to-end system for occluded scene text detection, recognition and prediction

Cited by: 0
Authors
Das, Alloy [1]
Palaiahnakote, Shivakumara [2]
Banerjee, Ayan [1]
Antonacopoulos, Apostolos [2]
Pal, Umapada [1]
Affiliations
[1] Indian Stat Inst, Comp Vis & Pattern Recognit Unit, Kolkata, India
[2] Univ Salford, Pattern Recognit & Image Anal PRImA Res Lab, Manchester, England
Keywords
Scene text detection; Scene text recognition; Scene text correction; Occluded scene text; Graph neural network; Convolutional recurrent neural network; Convolutional neural network
DOI
10.1016/j.knosys.2024.112593
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unpredictable occlusions of natural scene text pose a significant challenge, compounding the difficulties that the variability of such images already creates for text detection and recognition. To meet the need for a robust, consistently performing approach, this paper presents a new Soft Set-based end-to-end system for text detection, recognition and prediction in occluded natural scene images. This is the first approach to integrate text detection, recognition and prediction, unlike existing end-to-end text spotting systems, which cover text detection and recognition only. For detecting candidate text components, the proposed combination of Soft Sets with Maximally Stable Extremal Regions (SSMSER) improves text detection and spotting in natural scene images, irrespective of arbitrarily oriented and shaped text, complex backgrounds and occlusion. Furthermore, a Graph Recurrent Neural Network is proposed for grouping candidate text components into text lines and for fitting accurate bounding boxes to each word. Finally, a Convolutional Recurrent Neural Network (CRNN) is proposed for recognizing text and for predicting characters missing due to occlusion. Experimental results on a new occluded scene text dataset (OSTD) and on the most relevant benchmark natural scene text datasets demonstrate that the proposed system outperforms the state of the art in text detection, recognition and prediction. The code and dataset are available at https://github.com/alloydas/Softset-MSER-Based-Occluded-Scene-Text-Spotting/blob/master/Soft_set_MSER.ipynb
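The abstract does not say how Soft Sets are combined with MSER, so the following is only a minimal sketch of the generic soft set notion (a mapping from parameters to subsets of a universe), applied to hypothetical candidate regions. The region IDs, parameter names, memberships and helper functions are all invented for illustration and are not taken from the paper or its code.

```python
from typing import Dict, Set

# Universe: hypothetical IDs of candidate regions (e.g. as an MSER detector might produce).
universe: Set[int] = {0, 1, 2, 3, 4}

# A soft set (F, E): each parameter e in E maps to the subset F(e) of the
# universe that satisfies it. Names and memberships are assumptions.
soft_set: Dict[str, Set[int]] = {
    "stable_stroke_width": {0, 1, 3},
    "text_like_aspect_ratio": {0, 1, 2, 3},
    "high_contrast": {1, 3, 4},
}


def satisfying_all(fs: Dict[str, Set[int]], universe: Set[int]) -> Set[int]:
    """Return the elements that belong to F(e) for every parameter e."""
    result = set(universe)
    for approx in fs.values():
        result &= approx
    return result


def support_count(fs: Dict[str, Set[int]], universe: Set[int]) -> Dict[int, int]:
    """Count, for each element, how many parameters it satisfies."""
    return {u: sum(u in approx for approx in fs.values()) for u in universe}


candidates = satisfying_all(soft_set, universe)
scores = support_count(soft_set, universe)
```

Here `candidates` keeps only the regions that satisfy every parameter, while `scores` gives a softer ranking that could tolerate a region failing one criterion, for instance because part of it is occluded.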
Pages: 19