IMPROVEMENTS TO FILTERBANK AND DELTA LEARNING WITHIN A DEEP NEURAL NETWORK FRAMEWORK

Authors
Sainath, Tara N. [1]
Kingsbury, Brian [1]
Mohamed, Abdel-rahman
Saon, George [1]
Ramabhadran, Bhuvana [1]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Many features used in speech recognition are hand-crafted and are not always tied to the objective at hand, namely minimizing word error rate (WER). We recently showed promising results from replacing a perceptually motivated mel filter bank with a filter bank layer learned jointly with the rest of a deep neural network. In this paper, we extend filter learning to a speaker-adapted, state-of-the-art system. First, we incorporate delta learning into the filter learning framework. Second, we incorporate several speaker adaptation techniques, including VTLN warping and speaker identity features. On a 50-hour English Broadcast News task, filter and delta learning yields a 5% relative improvement in WER over a fixed set of filters and deltas. Furthermore, after speaker adaptation, filter and delta learning gives a 3% relative improvement in WER over a state-of-the-art CNN.
Pages: 5
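
Illustrative note (not part of the original record): the abstract contrasts learned filters and deltas with "a fixed set of filters and deltas" and reports gains as relative WER improvements. The sketch below shows what a fixed delta computation conventionally looks like, using the standard regression formula, together with the arithmetic behind a relative improvement. The function names `delta` and `relative_improvement`, the window size of 2, and the edge-padding choice are assumptions made for illustration; the record does not describe the authors' implementation. In a delta-learning setup, the fixed regression weights used here would presumably be replaced by trainable parameters.

```python
def delta(frames, window=2):
    """Fixed regression deltas over a list of per-frame feature vectors.

    Uses the conventional formula
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    and repeats the first/last frame at the edges (an assumed padding choice).
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at final frame
                behind = frames[max(t - n, 0)][k]     # clamp at first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def relative_improvement(baseline_wer, new_wer):
    """Relative WER reduction, expressed as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical inputs, chosen only to exercise the functions.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]   # one feature rising by 1 per frame
ramp_deltas = delta(ramp)
example_gain = relative_improvement(20.0, 19.0)  # hypothetical WERs, not from the paper
```

For a feature that rises linearly by one unit per frame, the interior delta equals the slope, which is a quick sanity check on the formula. The WER values passed to `relative_improvement` are made up; the record itself gives only the relative figures (5% and 3%), not absolute error rates.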