Visual-Textual Attention for Tree-Based Handwritten Mathematical Expression Recognition

Cited by: 0
Authors
Liao, Wei [1]
Liu, Jiayi [1]
Chen, Jianghan [1]
Wang, Qiu-Feng [1]
Huang, Kaizhu [2]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Suzhou, Peoples R China
[2] Duke Kunshan Univ, Data Sci Res Ctr, Suzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Handwritten mathematical expression recognition; Tree decoder; Visual-textual attention; Mutual learning; DECODER;
DOI
10.1007/978-981-97-1417-9_35
Chinese Library Classification
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
Handwritten mathematical expression recognition (HMER) has attracted much attention and achieved remarkable progress under the encoder-decoder framework. However, it remains challenging because of complex structures and illegible handwriting. In this paper, we refine the encoder-decoder framework for HMER. First, we propose a multi-scale visual and textual attention fusion mechanism to enhance the context drawn from both spatial and semantic information. Second, most HMER works simply treat recognition as a sequence-to-sequence problem (i.e., predicting a LaTeX string), ignoring the structural information in mathematical expressions. To overcome this issue, we use a tree decoder to capture such structural context. Furthermore, we propose a parent-children mutual learning method to enhance the learning of our encoder-decoder model. Extensive experiments on the CROHME 2014, 2016 and 2019 HMER benchmark datasets demonstrate the effectiveness of the proposed method.
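To make the contrast between a flat LaTeX string and a tree target concrete, here is a minimal sketch in Python. It is not the authors' decoder: the `Node` class, the relation names (`sub`, `sup`, `right`), and the `(child, parent, relation)` triple format are assumptions chosen only for illustration. It shows the same expression once as a string and once as explicit parent-child structure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One symbol in an expression tree (hypothetical structure for illustration)."""
    symbol: str
    # Maps a relation name ("sub", "sup", "right") to the child node it leads to.
    children: dict = field(default_factory=dict)


def to_latex(node):
    """Flatten the tree into a LaTeX-style string (the sequence view)."""
    out = node.symbol
    if "sub" in node.children:
        out += "_{" + to_latex(node.children["sub"]) + "}"
    if "sup" in node.children:
        out += "^{" + to_latex(node.children["sup"]) + "}"
    if "right" in node.children:
        out += " " + to_latex(node.children["right"])
    return out


def triples(node, parent="<root>", relation="start"):
    """List (child, parent, relation) triples in depth-first order (the tree view)."""
    result = [(node.symbol, parent, relation)]
    for rel, child in node.children.items():
        result.extend(triples(child, node.symbol, rel))
    return result


# The expression x^{2} + y, built as a tree.
expr = Node("x", {"sup": Node("2"), "right": Node("+", {"right": Node("y")})})

print(to_latex(expr))
for t in triples(expr):
    print(t)
```

In the string view the superscript relation is only implied by `^{...}` tokens, whereas in the triple view each symbol carries its parent and relation explicitly, which is the kind of structural information the abstract says sequence-to-sequence formulations ignore.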
Pages: 375-384
Page count: 10
Related papers
50 in total
  • [21] Sentiment Recognition for Short Annotated GIFs Using Visual-Textual Fusion
    Liu, Tianliang
    Wan, Junwei
    Dai, Xiubin
    Liu, Feng
    You, Quanzeng
    Luo, Jiebo
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (04) : 1098 - 1110
  • [22] Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition
    Zhang, Jianshu
    Du, Jun
    Dai, Lirong
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 2245 - 2250
  • [23] Multimodal visual-textual object graph attention network for propaganda detection in memes
    Chen, Pengyuan
    Zhao, Lei
    Piao, Yangheran
    Ding, Hongwei
    Cui, Xiaohui
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (12) : 36629 - 36644
  • [24] Bidirectional trained tree-structured decoder for Handwritten Mathematical Expression Recognition
    Cheng, Hanbo
    Liu, Chenyu
    Hu, Pengfei
    Zhang, Zhenrong
    Ma, Jiefeng
    Du, Jun
    PATTERN RECOGNITION, 2025, 165
  • [25] Multimodal visual-textual object graph attention network for propaganda detection in memes
    Pengyuan Chen
    Lei Zhao
    Yangheran Piao
    Hongwei Ding
    Xiaohui Cui
    Multimedia Tools and Applications, 2024, 83 : 36629 - 36644
  • [26] Online handwritten mathematical expression recognition
    Buyukbayrak, Hakan
    Yanikoglu, Berrin
    Ercil, Aytul
    DOCUMENT RECOGNITION AND RETRIEVAL XIV, 2007, 6500
  • [27] Online handwritten mathematical expression recognition
    Buyukbayrak, Hakan
    Yanikoglu, Berrin
    Ercil, Aytul
    2006 IEEE 14TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS, VOLS 1 AND 2, 2006, : 730 - +
  • [28] A GRU-based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition
    Zhang, Jianshu
    Du, Jun
    Dai, Lirong
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 902 - 907
  • [29] Handwritten Mathematical Expression Recognition via Attention Aggregation Based Bi-directional Mutual Learning
    Bian, Xiaohang
    Qin, Bo
    Xin, Xiaozhe
    Li, Jianwu
    Su, Xuefeng
    Wang, Yanfeng
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 113 - 121
  • [30] Relation-Based Representation for Handwritten Mathematical Expression Recognition
    Thanh-Nghia Truong
    Huy Quang Ung
    Hung Tuan Nguyen
    Cuong Tuan Nguyen
    Nakagawa, Masaki
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021 WORKSHOPS, PT I, 2021, 12916 : 7 - 19