Towards efficient unconstrained handwriting recognition using Dilated Temporal Convolution Network

Cited by: 20
Authors
Sharma A. [1]
Jayagopi D.B. [1]
Affiliations
[1] Multimodal Perception Lab, International Institute of Information Technology - Bangalore (IIIT-B), Bangalore
Keywords
Dilated Temporal Convolution Network; Document analysis; Handwriting recognition
DOI
10.1016/j.eswa.2020.114004
Abstract
Recognition of cursive handwritten images has advanced considerably with recent recurrent architectures and attention mechanisms. Most work focuses on improving transcription performance in terms of Character Error Rate (CER) and Word Error Rate (WER). Existing models, however, are slow to train and test. Furthermore, recent studies recommend that models be not only efficient in terms of task performance but also environmentally friendly in terms of carbon footprint. A review of recent state-of-the-art models suggests that training and retraining time should be considered during model design. High training time increases costs not only in terms of resources but also in carbon footprint. This is challenging for handwriting recognition models built on popular recurrent architectures, and it is especially critical because line images are usually very wide, resulting in long sequences to decode. In this work, we present a fully convolutional deep network architecture for cursive handwriting recognition from line-level images. The architecture combines 2-D convolutions and 1-D dilated non-causal convolutions with a Connectionist Temporal Classification (CTC) output layer. This offers high parallelism with a smaller number of parameters. We further report experiments with various image re-scaling factors and how they affect the performance of the proposed model. A data augmentation pipeline is also analyzed during model training. The experiments show that our model achieves CER and WER comparable to those of recurrent architectures. A comparison is made with state-of-the-art models based on Recurrent Neural Networks (RNN) and their variants. The analysis reports training performance and network details on three different datasets of English and French handwriting. This shows our model has fewer parameters and takes less training and testing time, making it suitable for low-resource and environment-friendly deployment. © 2020
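The 1-D dilated non-causal convolution named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give kernel sizes, dilation rates, or layer counts, so the values below (kernel size 3, dilations 1, 2, 4) and the function names `dilated_conv1d` and `receptive_field` are assumptions chosen only for illustration, not the authors' configuration.

```python
def dilated_conv1d(x, kernel, dilation):
    """Non-causal 1-D dilated convolution with zero padding ("same" length).

    Each output position t sees inputs on BOTH sides of t (non-causal),
    spaced `dilation` steps apart. Illustrative only; not the paper's code.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("use an odd kernel size so the window is centred")
    half = (len(kernel) - 1) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - half) * dilation  # offset is symmetric around t
            if 0 <= idx < len(x):            # out-of-range taps act as zeros
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in time steps) of stacked dilated conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed toy stack: three layers, kernel size 3, dilations 1, 2, 4.
dilations = [1, 2, 4]
signal = [0.0] * 31
signal[15] = 1.0  # unit impulse in the middle of the sequence

y = signal
for d in dilations:
    y = dilated_conv1d(y, [1.0, 1.0, 1.0], d)

# Number of positions influenced by the impulse equals the receptive field.
touched = sum(1 for v in y if v != 0.0)
print(touched, receptive_field(3, dilations))
```

Because every output position depends only on the input sequence and not on the previous output, all positions in a layer can be computed in parallel, which is the property the abstract contrasts with recurrent decoding of long line images.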
Related papers
50 in total
  • [31] TEMPORAL MODELING USING DILATED CONVOLUTION AND GATING FOR VOICE-ACTIVITY-DETECTION
    Chang, Shuo-Yiin
    Li, Bo
    Simko, Gabor
    Sainath, Tara N.
    Tripathi, Anshuman
    van den Oord, Aaron
    Vinyals, Oriol
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5549 - 5553
  • [32] Multipath Lightweight Deep Network Using Randomly Selected Dilated Convolution
    Park, Sangun
    Chang, Dong Eui
    SENSORS, 2021, 21 (23)
  • [33] End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network
    Tang, Duowei
    Kuppens, Peter
    Geurts, Luc
    van Waterschoot, Toon
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [34] End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network
    Duowei Tang
    Peter Kuppens
    Luc Geurts
    Toon van Waterschoot
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [35] An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
    Wang, Junyu
    INTERSPEECH 2023, 2023: 3699 - 3703
  • [37] Arabic handwriting recognition system using convolutional neural network
    Altwaijry, Najwa
    Al-Turaiki, Isra
    Neural Computing and Applications, 2021, 33 (07): 2249 - 2261
  • [38] Cursive handwriting recognition using the Hough transform and a neural network
    Ruiz-Pinales, J
    Lecolinet, E
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS: PATTERN RECOGNITION AND NEURAL NETWORKS, 2000: 231 - 234
  • [39] Arabic handwriting recognition system using convolutional neural network
    Najwa Altwaijry
    Isra Al-Turaiki
    Neural Computing and Applications, 2021, 33 : 2249 - 2261
  • [40] Hiragana handwriting recognition using deep neural network search
    Rosalina
    Hutagalung J.P.
    Sahuri G.
    International Journal of Interactive Mobile Technologies, 2020, 14 (01) : 161 - 168