Towards efficient unconstrained handwriting recognition using Dilated Temporal Convolution Network

Cited by: 20
Authors
Sharma A. [1]
Jayagopi D.B. [1]
Affiliations
[1] Multimodal Perception Lab, International Institute of Information Technology - Bangalore (IIIT-B), Bangalore
Keywords
Dilated Temporal Convolution Network; Document analysis; Handwriting recognition
DOI
10.1016/j.eswa.2020.114004
Abstract
Recognition of cursive handwritten images has advanced considerably with recent recurrent architectures and attention mechanisms. Most work focuses on improving transcription performance in terms of Character Error Rate (CER) and Word Error Rate (WER), yet existing models are slow to train and test. Furthermore, recent studies recommend that models be not only effective in terms of task performance but also environmentally friendly in terms of carbon footprint. A review of recent state-of-the-art models suggests that training and retraining time should be considered when designing a model: high training time increases costs not only in resources but also in carbon footprint. This is challenging for handwriting recognition models built on popular recurrent architectures, and it is especially critical because line images are usually very wide, which results in a longer sequence to decode. In this work, we present a fully convolutional deep network architecture for cursive handwriting recognition from line-level images. The architecture combines 2-D convolutions and 1-D dilated non-causal convolutions with a Connectionist Temporal Classification (CTC) output layer, which offers high parallelism with a smaller number of parameters. We further report experiments with various re-scaling factors of the images and how they affect the performance of the proposed model, and we analyze a data augmentation pipeline used during model training. The experiments show that our model performs comparably to recurrent architectures on CER and WER measures. We compare it with state-of-the-art models whose architectures are based on Recurrent Neural Networks (RNNs) and their variants. The analysis reports training performance and network details on three different datasets of English and French handwriting.
It shows that our model has fewer parameters and takes less training and testing time, making it suitable for low-resource and environment-friendly deployment. © 2020
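As a rough illustration of the 1-D dilated non-causal convolution named in the abstract, the following is a minimal pure-Python sketch. It is not the authors' implementation. The function names, the kernel size of 3, and the dilation schedule 1, 2, 4, 8 are assumptions chosen for illustration, since the abstract does not state them. "Non-causal" here means each output position sees inputs on both sides of it, which is possible because a whole line image is available at once.

```python
def dilated_conv1d(x, kernel, dilation):
    """Non-causal 1-D dilated convolution with zero padding ("same" length).

    As in deep-learning frameworks, this is a cross-correlation: the kernel
    is not flipped. Hypothetical helper, for illustration only.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("an odd kernel size is expected so the window is centred")
    half = k // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` steps apart and centred on t,
            # so both past and future positions contribute.
            idx = t + (j - half) * dilation
            if 0 <= idx < len(x):  # positions outside the input count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in time steps) of a stack of dilated conv layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_conv1d(signal, [1.0, 1.0, 1.0], dilation=2))
    # Assumed schedule: dilation doubles per layer.
    print(receptive_field(3, [1, 2, 4, 8]))
```

Because every output position is computed independently of the others, all positions can be evaluated in parallel, unlike the step-by-step updates of a recurrent layer. Doubling the dilation per layer also grows the receptive field quickly while the parameter count per layer stays fixed.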