Training augmentation with TANDEM acoustic modelling in Punjabi adult speech recognition system

Authors
Virender Kadyan
Shashi Bala
Puneet Bawa
Affiliations
[1] University of Petroleum & Energy Studies (UPES), Department of Informatics, School of Computer Science
[2] Chitkara University Institute of Engineering and Technology, Centre of Excellence for Speech and Multimodal Laboratory
[3] Chitkara University
Keywords
Tandem-NN; Data augmentation; Bottleneck features; Punjabi ASR; DNN-HMM
DOI
Not available
Abstract
Pre- and post-processing of low-resource acoustic signals has always faced the challenge of data scarcity in the training module. It is difficult to obtain high system accuracy with a limited training corpus, which results in the extraction of large discriminative feature vectors. The information in these vectors is distorted by the acoustic mismatch that arises from real environments and inter-speaker variation. In this paper, context-independent information of the input speech signal is pre-processed using bottleneck features, and in the modelling phase a Tandem-NN model is employed to enhance system accuracy. To address the shortage of training data, in-domain training augmentation is performed by fusing the original clean training data with artificially created noisy training data. To boost the training data further, the tempo of the input speech signal is modified while its spectral envelope and pitch are maintained. Experimental results show that the Tandem-NN system achieves a relative improvement of 13.53% in clean conditions and 32.43% in noisy conditions over the baseline system.
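The clean-plus-noisy fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not state the noise type, the signal-to-noise ratio, or the tooling used, so the function names `mix_at_snr` and `augment_training_set` and the `snr_db` parameter are assumptions made for illustration only, not the authors' method. Tempo modification is not covered here.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean waveform, scaled to a requested SNR in dB.

    Hypothetical helper: the SNR-based scaling is an assumption, not a
    detail taken from the paper.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat the noise if it is shorter than the utterance, then trim.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose scale so that clean_power / (scale^2 * noise_power)
    # equals 10^(snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


def augment_training_set(clean_utts, noise, snr_db):
    """Return the original clean utterances plus one noisy copy of each."""
    noisy = [mix_at_snr(u, noise, snr_db) for u in clean_utts]
    return list(clean_utts) + noisy
```

Under these assumptions, `augment_training_set(utterances, noise, snr_db=10)` doubles the size of the training set by pairing every clean waveform with a noisy counterpart.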
Pages: 473-481
Page count: 8