Data selection by sequence summarizing neural network in mismatch condition training

Cited by: 2
Authors:
Zmolikova, Katerina [1 ,2 ]
Karafiat, Martin [1 ,2 ]
Vesely, Karel [1 ,2 ]
Delcroix, Marc [3 ]
Watanabe, Shinji [4 ]
Burget, Lukas [1 ,2 ]
Cernocky, Jan Honza [1 ,2 ]
Affiliations:
[1] Brno Univ Technol, Speech FIT, Brno, Czech Republic
[2] IT4I Ctr Excellence, Brno, Czech Republic
[3] NTT Corp, NTT Commun Sci Labs, Kyoto, Japan
[4] MERL, Cambridge, MA USA
Keywords:
Automatic speech recognition; Data augmentation; Data selection; Mismatch training condition; Sequence summarization
DOI
10.21437/Interspeech.2016-741
CLC number:
O42 [Acoustics]
Subject classification codes:
070206; 082403
Abstract:
Data augmentation is a simple and efficient technique for improving the robustness of a speech recognizer deployed in mismatched training-test conditions. This paper proposes a new approach to selecting data according to the similarity of acoustic conditions. The similarity is computed with a sequence summarizing neural network, which extracts vectors containing an acoustic summary (e.g., noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. Results are reported on a mismatched condition, using the AMI training set with the proposed data selection and the CHiME3 test set.
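A minimal sketch of the selection idea described in the abstract. This is not the paper's implementation: the summary extractor below is a plain mean-pool over frame-level features rather than a trained sequence summarizing network, and the `select_data` helper and cosine-similarity ranking are illustrative assumptions.

```python
import numpy as np

def summary_vector(frame_features):
    # Mean-pool frame-level features into one utterance-level
    # "summary vector" (a stand-in for the learned pooling of a
    # sequence summarizing neural network).
    return np.mean(frame_features, axis=0)

def select_data(train_utts, test_utts, k):
    # Rank training utterances by cosine similarity between their
    # summary vector and the mean summary vector of the target
    # (test) condition; return indices of the k closest utterances.
    test_centroid = np.mean([summary_vector(u) for u in test_utts], axis=0)
    test_centroid /= np.linalg.norm(test_centroid)
    sims = []
    for u in train_utts:
        v = summary_vector(u)
        sims.append(float(np.dot(v, test_centroid) / np.linalg.norm(v)))
    order = np.argsort(sims)[::-1]  # most similar first
    return order[:k].tolist()
```

The selected subset would then be used (possibly after augmentation) to train the acoustic model so that the training data better matches the target acoustic condition.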
Pages: 2354-2358 (5 pages)
Related papers:
50 items in total
  • [41] Training data requirement for a neural network to predict aerodynamic coefficients
    Rajkumar, T
    Bardina, J
    INDEPENDENT COMPONENT ANALYSES, WAVELETS, AND NEURAL NETWORKS, 2003, 5102 : 92 - 103
  • [42] Efficient I/O for Neural Network Training with Compressed Data
    Zhang, Zhao
    Huang, Lei
    Pauloski, J. Gregory
    Foster, Ian T.
    2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020, 2020, : 409 - 418
  • [43] The Use Of Synthetic Data For Training The Neural Network To Classify The Aircrafts
    Below, A. N.
    Zarubin, A. A.
    Savelieva, A. A.
Tarlykov, A. V.
    2019 SYSTEMS OF SIGNALS GENERATING AND PROCESSING IN THE FIELD OF ON BOARD COMMUNICATIONS, 2019,
  • [44] Gist: Efficient Data Encoding for Deep Neural Network Training
    Jain, Animesh
    Phanishayee, Amar
    Mars, Jason
    Tang, Lingjia
    Pekhimenko, Gennady
    2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018, : 776 - 789
  • [45] Efficient partition of learning data sets for neural network training
Inst. of Bioorg. and Petrol. Chem., Kiev, Ukraine
NEURAL NETWORKS, 1997, 10 (08) : 1361 - 1374
  • [46] Physical Layer Neural Network Framework for Training Data Formation
    McClintick, Kyle W.
    Wyglinski, Alexander M.
    2018 IEEE 88TH VEHICULAR TECHNOLOGY CONFERENCE (VTC-FALL), 2018,
  • [47] Training a convolutional neural network to conserve mass in data assimilation
    Ruckstuhl, Yvonne
    Janjic, Tijana
    Rasp, Stephan
    NONLINEAR PROCESSES IN GEOPHYSICS, 2021, 28 (01) : 111 - 119
  • [48] Rule Extraction from Training Data Using Neural Network
    Biswas, Saroj Kumar
    Chakraborty, Manomita
    Purkayastha, Biswajit
    Roy, Pinki
    Thounaojam, Dalton Meitei
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2017, 26 (03)
  • [49] Efficient partition of learning data sets for neural network training
    Tetko, IV
    Villa, AEP
    NEURAL NETWORKS, 1997, 10 (08) : 1361 - 1374
  • [50] Reducing Neural Network Training Data using Support Vectors
    Dahiya, Kalpana
    Sharma, Anuj
    2014 RECENT ADVANCES IN ENGINEERING AND COMPUTATIONAL SCIENCES (RAECS), 2014,