Bilateral Convolutional Activations Encoded with Fisher Vectors for Scene Character Recognition

Cited by: 2
Authors
Zhang, Zhong [1]
Wang, Hong [1]
Liu, Shuang [1]
Durrani, Tariq S. [2]
Affiliations
[1] Tianjin Normal Univ, Tianjin Key Lab Wireless Mobile Commun & Power Tr, Tianjin, Peoples R China
[2] Univ Strathclyde, Dept Elect & Elect Engn, Glasgow, Lanark, Scotland
Funding
National Natural Science Foundation of China;
Keywords
bilateral convolutional activations; Fisher vectors; scene character recognition; TEXT; REPRESENTATION;
DOI
10.1587/transinf.2017EDL8238
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A rich and robust representation for scene characters plays a significant role in automatically understanding the text in images. In this letter, we focus on the issue of feature representation, and propose a novel encoding method named bilateral convolutional activations encoded with Fisher vectors (BCA-FV) for scene character recognition. Concretely, we first extract convolutional activation descriptors from convolutional maps and then build a bilateral convolutional activation map (BCAM) to capture the relationship between the convolutional activation response and the spatial structure information. Finally, in order to obtain the global feature representation, the BCAM is injected into FV to encode convolutional activation descriptors. Hence, the BCA-FV can effectively integrate the prominent features and spatial structure information for character representation. We verify our method on two widely used databases (ICDAR2003 and Chars74K), and the experimental results demonstrate that our method achieves better results than the state-of-the-art methods. In addition, we further validate the proposed BCA-FV on the "Pan+ChiPhoto" database for Chinese scene character recognition, and the experimental results show the good generalization ability of the proposed BCA-FV.
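The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the BCAM formula, so the per-location weight `w` below (the channel-summed activation) is only an assumed stand-in, and the GMM parameters `mix`, `mu`, `sigma` are random placeholders rather than a trained model. The sketch shows how a spatial weight map could be injected into a standard Fisher vector over convolutional activation descriptors.

```python
import numpy as np

def gmm_posteriors(X, mix, mu, sigma):
    """Soft assignments gamma[n, k] under a diagonal-covariance GMM."""
    z = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]   # (N, K, D)
    log_prob = (-0.5 * (z ** 2).sum(-1)
                - np.log(sigma).sum(-1)[None, :]
                - 0.5 * X.shape[1] * np.log(2 * np.pi))
    log_post = np.log(mix)[None, :] + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)            # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def weighted_fisher_vector(X, w, mix, mu, sigma):
    """Fisher vector in which descriptor n contributes with weight w[n].

    ASSUMPTION: weighting the posteriors by w is one plausible way to inject
    a spatial map into FV; the paper's exact BCA-FV formulation may differ.
    """
    gamma = gmm_posteriors(X, mix, mu, sigma) * w[:, None]     # (N, K)
    z = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]   # (N, K, D)
    total = w.sum()
    # Gradients with respect to the GMM means and standard deviations.
    g_mu = (gamma[:, :, None] * z).sum(0) / (total * np.sqrt(mix)[:, None])
    g_sigma = (gamma[:, :, None] * (z ** 2 - 1)).sum(0) / (
        total * np.sqrt(2 * mix)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                     # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv                       # L2 normalization

# Toy data standing in for convolutional maps of size H x W with D channels.
rng = np.random.default_rng(0)
H, W, D, K = 6, 6, 8, 4
conv_maps = rng.random((H, W, D))
X = conv_maps.reshape(-1, D)          # one D-dim descriptor per spatial location
w = X.sum(axis=1)                     # ASSUMED stand-in for the BCAM weights
mix = np.full(K, 1.0 / K)             # placeholder mixing weights
mu = rng.random((K, D))               # placeholder component means
sigma = np.full((K, D), 0.5)          # placeholder standard deviations
fv = weighted_fisher_vector(X, w, mix, mu, sigma)
```

The resulting vector has length 2 x K x D and unit L2 norm. In practice the GMM would be fitted on training descriptors and the encoded vectors passed to a classifier.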
Pages: 1453-1456
Number of pages: 4
Related papers
50 in total
  • [21] VERY HIGH RESOLUTION IMAGE SCENE CLASSIFICATION WITH SEMANTIC FISHER VECTORS
    Chaib, Souleyman
    Gu, Yanfeng
    Yao, Hongxun
    Belkadi, Khaled
    IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 6844 - 6847
  • [22] Optical Character Recognition for Scene Text Detection, Mining and Recognition
    Nathiya, N.
    Pradeepa, K.
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2013, : 662 - 665
  • [23] Dimensionality reduction of Fisher vectors for human action recognition
    Oruganti, Venkata Ramana Murthy
    Goecke, Roland
    IET COMPUTER VISION, 2016, 10 (05) : 392 - 397
  • [24] RNN Fisher Vectors for Action Recognition and Image Annotation
    Lev, Guy
    Sadeh, Gil
    Klein, Benjamin
    Wolf, Lior
    COMPUTER VISION - ECCV 2016, PT VI, 2016, 9910 : 833 - 850
  • [25] Hierarchical Coding of Convolutional Features for Scene Recognition
    Xie, Lin
    Lee, Feifei
    Liu, Li
    Yin, Zhong
    Chen, Qiu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (05) : 1182 - 1192
  • [26] Sparse Decomposition of Convolutional Features for Scene Recognition
    Xie, Lin
    Lee, Feifei
    Yan, Yan
    Chen, Qiu
    2017 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND APPLICATIONS (ICCIA), 2017, : 345 - 348
  • [27] Convolutional Attention Networks for Scene Text Recognition
    Xie, Hongtao
    Fang, Shancheng
    Zha, Zheng-Jun
    Yang, Yating
    Li, Yan
    Zhang, Yongdong
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15 (01)
  • [28] FISHER VECTOR ENCODED DEEP CONVOLUTIONAL FEATURES FOR UNCONSTRAINED FACE VERIFICATION
    Chen, Jun-Cheng
    Zheng, Jingxiao
    Patel, Vishal M.
    Chellappa, Rama
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 2981 - 2985
  • [29] SCENE TEXT RECOGNITION WITH TEMPORAL CONVOLUTIONAL ENCODER
    Du, Xiangcheng
    Ma, Tianlong
    Zheng, Yingbin
    Ye, Hao
    Wu, Xingjiao
    He, Liang
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2383 - 2387
  • [30] Convolutional neural network vectors for speaker recognition
    Hourri, Soufiane
    Nikolov, Nikola S.
    Kharroubi, Jamal
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (02) : 389 - 400