Training augmentation with TANDEM acoustic modelling in Punjabi adult speech recognition system

Authors
Virender Kadyan
Shashi Bala
Puneet Bawa
Affiliations
[1] University of Petroleum & Energy Studies (UPES), Department of Informatics, School of Computer Science
[2] Chitkara University Institute of Engineering and Technology, Centre of Excellence for Speech and Multimodal Laboratory
[3] Chitkara University
Keywords
Tandem-NN; Data augmentation; Bottleneck features; Punjabi ASR; DNN-HMM
Abstract
Pre- and post-processing of acoustic signals for low-resource languages has always faced the challenge of data scarcity in the training module. It is difficult to obtain high system accuracy with a limited training corpus, which results in the extraction of large discriminative feature vectors. The information in these vectors is distorted by the acoustic mismatch that arises from real environments and inter-speaker variation. In this paper, context-independent information of the input speech signal is pre-processed using bottleneck features, and a Tandem-NN model is then employed in the modeling phase to enhance system accuracy. To address the shortage of training data, in-domain training augmentation is performed by fusing the original clean training data with artificially created noisy versions of it. To further boost the training data, the tempo of the input speech signal is modified while its spectral envelope and pitch are maintained. Experimental results show that the Tandem-NN system achieves a relative improvement of 13.53% in clean conditions and 32.43% in noisy conditions over the baseline system.
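The noisy half of the augmentation described in the abstract can be illustrated with a minimal sketch. It assumes that the "artificially created" noisy data is produced by adding a noise recording to each clean utterance at a chosen signal-to-noise ratio, which the abstract does not specify. The function names (`power`, `mix_at_snr`, `augment_training_set`) and the SNR value are hypothetical, and the pitch-preserving tempo modification is not covered here.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a list of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB.

    Assumption: additive mixing at a fixed SNR stands in for the paper's
    unspecified method of creating noisy training data.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean, p_noise = power(clean), power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # Choose the gain so that p_clean / (gain^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def augment_training_set(clean_utts, noise_utts, snr_db):
    """Return the original clean utterances followed by their noisy copies."""
    noisy = [mix_at_snr(c, n, snr_db) for c, n in zip(clean_utts, noise_utts)]
    return list(clean_utts) + noisy


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical stand-ins for real audio: a 440 Hz tone and Gaussian noise.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
    noise = [random.gauss(0.0, 1.0) for _ in range(1600)]
    pooled = augment_training_set([clean], [noise], snr_db=10.0)
    print(len(pooled), "utterances after augmentation")
```

Pooling the clean and noisy copies doubles the number of training utterances while keeping the transcripts unchanged, which is the sense of "in-domain" augmentation assumed in this sketch.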
Pages: 473-481
Number of pages: 8