A Scene Tibetan Text Detection by Combining Multi-scale and Dual-Channel Features

Cited by: 0
Authors
Dangzhi, Cairang [1 ,2 ,3 ]
Huang, Heming [1 ,2 ,3 ]
Fan, Yonghong [1 ,2 ,3 ]
Fan, Yutao [1 ,2 ,3 ]
Affiliations
[1] Qinghai Normal Univ, Sch Comp, Xining 810008, Peoples R China
[2] State Key Lab Tibetan Intelligent Informat Proc &, Xining 810008, Peoples R China
[3] Minist Educ, Key Lab Tibetan Informat Proc, Xining 810008, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale Feature; Dual-channel Attention; Scene Tibetan Text Detection; Skip Connections; YOLO;
DOI
10.1007/978-3-031-61816-1_11
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tibetan text detection in scenes plays a vital role in applications such as image search, real-time translation, and the preservation of Tibetan cultural heritage. However, recognizing Tibetan text in natural scene images is challenging because of variable fonts, complex backgrounds, and poor imaging conditions. In this study, we present an approach called Multi-Scale Dual-Channel Feature Fusion (MDFF) for Tibetan scene text detection. The method aims to accurately infer text in complex scenes by leveraging multi-scale interactions between texts. MDFF incorporates a feature pyramid network with skip connections, which fuses features at different scales hierarchically. Additionally, we employ a dual-channel attention (DCA) mechanism to capture rich interactions between text instances while mitigating the impact of background noise. Experimental results on the scene Tibetan text detection database (STTDD) demonstrate the effectiveness of MDFF, which achieves an F1 score of 85.20%. The proposed method outperforms the baseline model by 5 percentage points and surpasses six state-of-the-art methods in single Tibetan text detection.
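The abstract reports results as an F1 score and a gain in percentage points. As a reminder of how that metric is usually defined, the sketch below computes F1 as the harmonic mean of precision and recall from detection counts. The function name `f1_score` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the paper, whose matching criteria and precision/recall values are not given in this record.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2PR / (P + R) from true-positive, false-positive
    and false-negative counts. Returns 0.0 when F1 is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the paper.
baseline = f1_score(tp=80, fp=20, fn=20)
improved = f1_score(tp=85, fp=15, fn=15)

# A gain "in percentage points" is the plain difference of the two
# percentages, not a relative improvement.
gain_points = (improved - baseline) * 100
print(f"baseline F1 = {baseline:.2%}, improved F1 = {improved:.2%}")
print(f"gain = {gain_points:.2f} percentage points")
```

Read this way, a 5-percentage-point gain over a baseline is an absolute difference between two F1 values, which is a different quantity from a 5% relative improvement.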
Pages: 158 - 171
Page count: 14
Related papers
50 in total
  • [21] Scene Text Removal Based on Multi-scale Attention Mechanism
    He, Ping
    Zhang, Heng
    Liu, Chenglin
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2022, 35 (07): : 614 - 624
  • [22] Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition
    Yao, Cong
    Bai, Xiang
    Shi, Baoguang
    Liu, Wenyu
    2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, : 4042 - 4049
  • [23] Dual Constraint Parallel Multi-scale Attention Network for Insulator Detection in Foggy Scene
    Sun, Hang
    Huang, Longhui
    Yu, Mei
    Ren, Dong
    Fu, Qiuyue
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XII, 2025, 15042 : 287 - 300
  • [24] Robust Scene Text Detection Under Occlusion via Multi-scale Adaptive Deep Network
    Dinh, My-Tham
    Minh-Trieu Tran
    Quang-Vinh Dang
    Lee, Guee-Sang
    FRONTIERS OF COMPUTER VISION, IW-FCV 2023, 2023, 1857 : 122 - 134
  • [25] Time-varying speed fault diagnosis based on dual-channel parallel multi-scale information
    Wang, Hongchao
    Xue, Guoqing
    Yu, Li
    Li, Simin
    Guo, Zhiqiang
    Du, Wenliao
    JOURNAL OF MECHANICAL SCIENCE AND TECHNOLOGY, 2024, 38 (11) : 5961 - 5978
  • [26] Fine-Grained Modulation Classification Using Multi-Scale Radio Transformer With Dual-Channel Representation
    Zheng, Qinghe
    Zhao, Penghui
    Wang, Hongjun
    Elhanashi, Abdussalam
    Saponara, Sergio
    IEEE COMMUNICATIONS LETTERS, 2022, 26 (06) : 1298 - 1302
  • [27] Pyrboxes: An efficient multi-scale scene text detector with feature pyramids
    Sheng, Fenfen
    Chen, Zhineng
    Zhang, Wei
    Xu, Bo
    PATTERN RECOGNITION LETTERS, 2019, 125 : 228 - 234
  • [28] DCMS-YOLOv5: A Dual-Channel and Multi-Scale Vertical Expansion Helmet Detection Model Based on YOLOv5
    Liu, Yulu
    Tian, Ying
    ENGINEERING LETTERS, 2023, 31 (01) : 1 - 7
  • [29] Chinese Short Text Sentiment Classification Model Integrating Dual-Channel Features
    Zang, Jie
    Lu, Jintao
    Wang, Yan
    Li, Xiang
    Liao, Huizhi
    Computer Engineering and Applications, 2024, 60 (21) : 116 - 126
  • [30] Transforming Scene Text Detection and Recognition: A Multi-Scale End-to-End Approach With Transformer Framework
    Geng, Tianyu
    IEEE ACCESS, 2024, 12 : 40582 - 40596