A Scene Tibetan Text Detection by Combining Multi-scale and Dual-Channel Features

Cited by: 0
Authors
Dangzhi, Cairang [1 ,2 ,3 ]
Huang, Heming [1 ,2 ,3 ]
Fan, Yonghong [1 ,2 ,3 ]
Fan, Yutao [1 ,2 ,3 ]
Affiliations
[1] Qinghai Normal Univ, Sch Comp, Xining 810008, Peoples R China
[2] State Key Lab Tibetan Intelligent Informat Proc &, Xining 810008, Peoples R China
[3] Minist Educ, Key Lab Tibetan Informat Proc, Xining 810008, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale Feature; Dual-channel Attention; Scene Tibetan Text Detection; Skip Connections; YOLO;
DOI
10.1007/978-3-031-61816-1_11
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tibetan text detection in scenes plays a vital role in various applications, including image search, real-time translation, and the preservation of Tibetan cultural heritage. However, recognizing Tibetan text in natural scene images is a challenging task due to factors such as variable fonts, complex backgrounds, and poor imaging conditions. In this study, we present a novel approach called Multi-Scale Dual-Channel Feature Fusion (MDFF) for Tibetan scene text detection. Our method aims to accurately infer text in complex scenes by leveraging multi-scale interactions between texts. MDFF incorporates a feature pyramid network with skip connections, enabling the fusion of features at different scales in a hierarchical manner. Additionally, we employ a dual-channel attention (DCA) mechanism to capture rich interactions between text instances while mitigating the impact of background noise. Experimental results on the scene Tibetan text detection database (STTDD) demonstrate the effectiveness of MDFF, achieving an impressive F1 score of 85.20%. Our proposed method outperforms the baseline model by 5 percentage points and surpasses the performance of six state-of-the-art methods in single Tibetan text detection.
Pages: 158-171
Number of pages: 14