Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning

Cited by: 11
Authors
Humayun, Mohammad Ali [1]
Hameed, Ibrahim A. [2]
Shah, Syed Muslim [1]
Khan, Sohaib Hassan [1]
Zafar, Irfan [1]
Bin Ahmed, Saad [3]
Shuja, Junaid [4]
Affiliations
[1] Univ Engn & Technol Peshawar, Dept Elect Engn, Inst Commun Technol ICT Campus, Islamabad 44000, Pakistan
[2] Norwegian Univ Sci & Technol, Fac Informat Technol & Elect Engn, Dept ICT & Nat Sci, N-6001 Alesund, Norway
[3] Univ Teknol Malaysia, MJIIT, Jalan Sultan Yahya Petra, Kuala Lumpur 54100, Malaysia
[4] COMSATS Univ Islamabad, Dept Comp Sci, Abbottabad Campus, Abbottabad 22010, Pakistan
Source
APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 09
Keywords
speech recognition; locally linear embedding; label propagation; Maxout; low resource languages
DOI
10.3390/app9091956
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Automatic Speech Recognition (ASR) has achieved its best results for English with end-to-end, neural-network-based supervised models. These supervised models need large amounts of labeled speech data to generalize well, which can be difficult to obtain for low-resource languages like Urdu. Most models proposed for Urdu ASR are based on Hidden Markov Models (HMMs). This paper proposes an end-to-end neural network model for Urdu ASR, regularized with dropout, ensemble averaging, and Maxout units. Dropout and ensembles are averaging techniques over multiple neural network models, while Maxout units are neural network units that adapt their activation functions. Because labeled data is limited, Semi-Supervised Learning (SSL) techniques are also incorporated to improve model generalization. Speech features are transformed into a lower-dimensional manifold using an unsupervised dimensionality-reduction technique called Locally Linear Embedding (LLE). The transformed data, along with the higher-dimensional features, is used to train neural networks. The proposed model also utilizes label propagation-based self-training of initially trained models and achieves a Word Error Rate (WER) 4% lower than the benchmark reported on the same Urdu corpus using HMMs. The decrease in WER after incorporating SSL is more significant with an increased validation data size.
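
The abstract describes Maxout units as units that adapt their activation functions. The sketch below shows that idea in its simplest form: one Maxout unit outputs the maximum of several affine functions of its input, so the shape of the activation depends on the learned weights. This is a minimal illustration and not the authors' implementation; the function name `maxout` and the hand-picked parameter sets `RELU_LIKE` and `ABS_LIKE` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence


def maxout(x: Sequence[float],
           weights: Sequence[Sequence[float]],
           biases: Sequence[float]) -> float:
    """Return the output of one Maxout unit: the max over k affine pieces.

    Piece j is z_j = dot(weights[j], x) + biases[j].
    """
    if len(weights) != len(biases) or not weights:
        raise ValueError("need at least one piece and one bias per piece")
    pieces: List[float] = []
    for w, b in zip(weights, biases):
        if len(w) != len(x):
            raise ValueError("weight vector length must match input length")
        pieces.append(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(pieces)


# Hypothetical parameters set by hand; in a trained network they are learned.
RELU_LIKE = ([[1.0], [0.0]], [0.0, 0.0])   # max(x, 0) behaves like ReLU
ABS_LIKE = ([[1.0], [-1.0]], [0.0, 0.0])   # max(x, -x) behaves like |x|

if __name__ == "__main__":
    for value in (-2.0, 0.5):
        print(value, maxout([value], *RELU_LIKE), maxout([value], *ABS_LIKE))
```

With two pieces, the same unit can act like a rectifier or like an absolute-value function depending only on its weights, which is the sense in which the activation is adaptive.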
Pages: 15