A Graph-Transformer Network for Scene Text Detection

Cited by: 0
Authors
Wu, Yongrong [1]
Lin, Jingyu [1]
Chen, Houjin [1]
Chen, Dinghao [1]
Yang, Lvqing [1]
Xiahou, Jianbing [2]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen 361000, Peoples R China
[2] Quanzhou Normal Univ, Quanzhou 362000, Fujian, Peoples R China
Keywords
Scene text detection; Transformer; Graph convolutional network
DOI
10.1007/978-981-99-4761-4_57
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Detecting text in natural images with varying orientations and shapes is challenging. Existing detectors often fail on text instances with extreme aspect ratios. This paper introduces GTNet, a Graph-Transformer network for scene text detection. GTNet uses a Graph-based Shared Feature Learning Module (GSFL) for feature extraction and a Transformer-based Regression Module (TRM) for bounding box prediction. The architecture offers a flexible receptive field, combining global attention with local features for enhanced text representation. Extensive experiments show that the method surpasses existing detectors in accuracy and effectiveness.
Pages: 680-690
Page count: 11