A Scene Tibetan Text Detection by Combining Multi-scale and Dual-Channel Features

Authors
Dangzhi, Cairang [1, 2, 3]
Huang, Heming [1, 2, 3]
Fan, Yonghong [1, 2, 3]
Fan, Yutao [1, 2, 3]
Affiliations
[1] Qinghai Normal Univ, Sch Comp, Xining 810008, Peoples R China
[2] State Key Lab Tibetan Intelligent Informat Proc &, Xining 810008, Peoples R China
[3] Minist Educ, Key Lab Tibetan Informat Proc, Xining 810008, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale Feature; Dual-channel Attention; Scene Tibetan Text Detection; Skip Connections; YOLO;
DOI
10.1007/978-3-031-61816-1_11
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tibetan text detection in scenes plays a vital role in various applications, including image search, real-time translation, and the preservation of Tibetan cultural heritage. However, recognizing Tibetan text in natural scene images is a challenging task due to factors such as variable fonts, complex backgrounds, and poor imaging conditions. In this study, we present a novel approach called Multi-Scale Dual-Channel Feature Fusion (MDFF) for Tibetan scene text detection. Our method aims to accurately infer text in complex scenes by leveraging multi-scale interactions between texts. MDFF incorporates a feature pyramid network with skip connections, enabling the fusion of features at different scales in a hierarchical manner. Additionally, we employ a dual-channel attention (DCA) mechanism to capture rich interactions between text instances while mitigating the impact of background noise. Experimental results on the scene Tibetan text detection database (STTDD) demonstrate the effectiveness of MDFF, achieving an impressive F1 score of 85.20%. Our proposed method outperforms the baseline model by 5 percentage points and surpasses the performance of six state-of-the-art methods in single Tibetan text detection.
Pages: 158-171
Page count: 14