Transfer learning through perturbation-based in-domain spectrogram augmentation for adult speech recognition

Cited by: 0
Authors
Kadyan, Virender [1]
Bawa, Puneet [2]
Affiliations
[1] Speech and Language Research Centre, School of Computer Science, University of Petroleum & Energy Studies (UPES), Energy Acres, Bidholi, Uttarakhand, Dehradun, 248007, India
[2] Centre of Excellence for Speech and Multimodal Laboratory, Chitkara University Institute of Engineering & Technology, Chitkara University, Punjab, Rajpura, India
Source
Neural Computing and Applications | 2022 / Vol. 34 / Issue 23
Keywords
Automatic speech recognition system - Data augmentation - Data scarcity - Learning techniques - Overfitting - Pedagogical practices - Punjabi speech recognition - Spectrogram augmentation - Spectrograms - Transfer learning
Pages: 21015 - 21033
Related papers
50 in total
  • [21] Transfer learning from adult to children for speech recognition: Evaluation, analysis and recommendations
    Shivakumar, Prashanth Gurunath
    Georgiou, Panayiotis
    COMPUTER SPEECH AND LANGUAGE, 2020, 63
  • [22] Speech Emotion Recognition Based on Sparse Transfer Learning Method
    Song, Peng
    Zheng, Wenming
    Liang, Ruiyu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (07) : 1409 - 1412
  • [23] Improving Children's Speech Recognition through Explicit Pitch Scaling based on Iterative Spectrogram Inversion
    Ahmad, W.
    Shahnawazuddin, S.
    Kathania, H. K.
    Pradhan, G.
    Samaddar, A. B.
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017 : 2391 - 2395
  • [24] CycleGAN-based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition
    Bao, Fang
    Neumann, Michael
    Vu, Ngoc Thang
    INTERSPEECH 2019, 2019 : 2828 - 2832
  • [25] Audio Augmentation for Non-Native Children's Speech Recognition through Discriminative Learning
    Radha, Kodali
    Bansal, Mohan
    ENTROPY, 2022, 24 (10)
  • [26] A domain-mismatch speech recognition system in radio communication based on improved spectrum augmentation
    Sun, Xiusong
    Yang, Qun
    Liu, Shaohan
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
  • [27] Machine learning model based on the time domain regular perturbation-based theory for performance estimation in arbitrary heterogeneous optical links
    Ye, Xiaoyan
    Ghazisaeidi, Amirhossein
    OPTICAL FIBER TECHNOLOGY, 2025, 89
  • [28] SOURCE DOMAIN DATA SELECTION FOR IMPROVED TRANSFER LEARNING TARGETING DYSARTHRIC SPEECH RECOGNITION
    Xiong, Feifei
    Barker, Jon
    Yue, Zhengjun
    Christensen, Heidi
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 7424 - 7428
  • [29] Data augmentation and transfer learning for cross-lingual Named Entity Recognition in the biomedical domain
    Lancheros, Brayan Stiven
    Pastor, Gloria Corpas
    Mitkov, Ruslan
    LANGUAGE RESOURCES AND EVALUATION, 2024
  • [30] Geological object recognition in legacy maps through data augmentation and transfer learning techniques
    Li, Wenjia
    Chen, Weilin
    Zhang, Jiyin
    Li, Chenhao
    Ma, Xiaogang
    APPLIED COMPUTING AND GEOSCIENCES, 2025, 25