A New Approach to Data Annotation Automation for Online Handwritten Mathematical Expression Recognition based on Recurrent Neural Networks

Cited by: 1
Authors
Zhelezniakov, Dmytro [1, 2]
Cherneha, Anastasiia [1, 2]
Zaytsev, Viktor [1 ]
Radyvonenko, Olga [1 ]
Affiliations
[1] Samsung R&D Inst, 57 Lva Tolstogo Str, Kiev, Ukraine
[2] Taras Shevchenko Natl Univ Kyiv, Kiev, Ukraine
Keywords
GENERATION;
DOI
10.1109/SMC52423.2021.9658867
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Modern deep-learning-based recognition methods require large amounts of training data. However, such data is not always publicly available and is often too small or limited in the number of classes. For many applications, particularly optical character recognition and handwriting recognition, preparing ground truth data is expensive, time-consuming, and error-prone during both collection and annotation. For 2-dimensional languages (diagrams, charts, mathematical formulas), annotation is further complicated: in addition to a large number of symbol classes that varies by application, the spatial relations between symbols or classes must also be annotated. In this work, we propose an approach for automatic annotation of online handwritten mathematical expressions. Starting from an LSTM-based recognition model and a small annotated dataset, this iterative approach produces a hierarchical annotation and gradually extends the alphabet, improving the recognition accuracy of new symbol classes. The proposed approach does not require prior verification of the gathered dataset and comprises three main stages: training recognition models, automatic annotation using recognition and matching algorithms, and automatic verification. These stages are repeated until the number of newly recognized and annotated samples becomes small enough. Samples that fail automatic verification are treated as suspicious and require manual verification or refinement, which is done at the last stage. In our experiment, more than 85% of the samples were annotated automatically, with symbol-level annotation accuracy above 99%. The experimental results show that the proposed approach saves up to 90% of the time spent on manual operations. The proposed approach can also be applied to high-noise datasets.
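The three-stage loop described in the abstract (train, annotate, verify, repeat until few new samples are accepted) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `iterative_annotation`, its parameters `min_new` and `max_rounds`, and the toy `train`, `annotate`, and `verify` callbacks are all hypothetical. The toy "model" is simply the set of symbols seen so far, standing in for the LSTM recognizer, and the toy matching rule (accept a sample with at most one unknown symbol) stands in for the paper's recognition and matching algorithms.

```python
from typing import Any, Callable, Dict, List, Tuple


def iterative_annotation(
    seed: Dict[str, Any],
    unlabeled: List[str],
    train: Callable[[Dict[str, Any]], Any],
    annotate: Callable[[Any, str], Any],
    verify: Callable[[str, Any], bool],
    min_new: int = 1,       # hypothetical stopping threshold
    max_rounds: int = 20,   # hypothetical safety cap on iterations
) -> Tuple[Dict[str, Any], List[str]]:
    """Repeat train -> annotate -> verify until few new samples are accepted.

    Returns the annotated samples and the samples left for manual review.
    """
    labeled = dict(seed)
    pending = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)              # stage 1: (re)train the recognizer
        still_pending, new_count = [], 0
        for sample in pending:
            ann = annotate(model, sample)   # stage 2: automatic annotation
            if verify(sample, ann):         # stage 3: automatic verification
                labeled[sample] = ann
                new_count += 1
            else:
                still_pending.append(sample)
        pending = still_pending
        if new_count < min_new or not pending:
            break
    return labeled, pending


# Toy stand-ins (assumptions for illustration only).
def train(labeled):
    # The "model" is the alphabet of symbols seen in accepted annotations.
    return {sym for ann in labeled.values() for sym in ann}


def annotate(model, sample):
    # Matching can fill in at most one symbol the model does not know yet.
    unknown = set(sample) - model
    return tuple(sample) if len(unknown) <= 1 else None


def verify(sample, ann):
    return ann is not None


seed = {"a+b": ("a", "+", "b")}
unlabeled = ["a+c", "c+d", "x*y", "d-a"]
labeled, pending = iterative_annotation(seed, unlabeled, train, annotate, verify)
```

In this toy run the alphabet grows by one symbol per round, so `a+c`, `c+d`, and `d-a` are accepted in successive iterations, while `x*y` never passes verification and is left in `pending` for manual review, mirroring the final manual stage described above.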
Pages: 1125 - 1132
Number of pages: 8