Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition

Cited by: 135
Authors
Zhang, Jianshu [1]
Du, Jun [1]
Zhang, Shiliang [1]
Liu, Dan [2]
Hu, Yulong [2]
Hu, Jinshui [2]
Wei, Si [2]
Dai, Lirong [1]
Affiliations
[1] University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, Anhui, People's Republic of China
[2] iFLYTEK Research, Hefei, Anhui, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Handwritten mathematical expression recognition; Neural network; Attention; Features
DOI
10.1016/j.patcog.2017.06.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine recognition of a handwritten mathematical expression (HME) is challenging due to the ambiguities of handwritten symbols and the two-dimensional structure of mathematical expressions. Inspired by recent work in deep learning, we present Watch, Attend and Parse (WAP), a novel end-to-end approach based on neural network that learns to recognize HMEs in a two-dimensional layout and outputs them as one-dimensional character sequences in LaTeX format. Inherently unlike traditional methods, our proposed model avoids problems that stem from symbol segmentation, and it does not require a predefined expression grammar. Meanwhile, the problems of symbol recognition and structural analysis are handled, respectively, using a watcher and a parser. We employ a convolutional neural network encoder that takes HME images as input as the watcher and employ a recurrent neural network decoder equipped with an attention mechanism as the parser to generate LaTeX sequences. Moreover, the correspondence between the input expressions and the output LaTeX sequences is learned automatically by the attention mechanism. We validate the proposed approach on a benchmark published by the CROHME international competition. Using the official training dataset, WAP significantly outperformed the state-of-the-art method with an expression recognition accuracy of 46.55% on CROHME 2014 and 44.55% on CROHME 2016. (C) 2017 Elsevier Ltd. All rights reserved.
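As a rough illustration of how a parser can attend over the watcher's two-dimensional output, the sketch below computes one attention step over a small grid of feature vectors. It is a minimal sketch under stated assumptions, not the WAP formulation: the abstract does not specify the scoring function, so plain dot-product scoring is swapped in here, and the names `attend`, `feature_grid`, and `query` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(feature_grid, query):
    """One attention step over an H x W grid of D-dimensional vectors.

    feature_grid stands in for the watcher (CNN encoder) output and
    query stands in for the parser (RNN decoder) state. Both names are
    hypothetical; dot-product scoring is an assumption of this sketch.
    Returns (H x W attention weights, D-dimensional context vector).
    """
    width = len(feature_grid[0])
    # Flatten the grid row by row so every position gets one score.
    flat = [vec for row in feature_grid for vec in row]
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in flat]
    alphas = softmax(scores)
    # The context vector is the attention-weighted sum of the features.
    context = [
        sum(a * vec[d] for a, vec in zip(alphas, flat))
        for d in range(len(query))
    ]
    # Restore the 2-D layout of the weights for inspection.
    weights = [alphas[r * width:(r + 1) * width]
               for r in range(len(feature_grid))]
    return weights, context


# Toy 2 x 2 grid of 2-dimensional features (made-up values).
grid = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
weights, context = attend(grid, query=[2.0, 2.0])
```

In this toy input the bottom-right cell aligns best with the query, so it receives the largest weight. In a decoder, a context vector of this kind would be recomputed at every output step before the next LaTeX token is predicted.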
Pages: 196-206
Page count: 11