EXTRACTING DEEP BOTTLENECK FEATURES USING STACKED AUTO-ENCODERS

Cited by: 0
Authors
Gehring, Jonas [1]
Miao, Yajie [2]
Metze, Florian [2]
Waibel, Alex [1, 2]
Affiliations
[1] Karlsruhe Inst Technol, Interact Syst Lab, D-76021 Karlsruhe, Germany
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Bottleneck features; Deep learning; Auto-encoders; NETWORKS;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this work, a novel training scheme for generating bottleneck features from deep neural networks is proposed. A stack of denoising auto-encoders is first trained in a layer-wise, unsupervised manner. Afterwards, the bottleneck layer and an additional layer are added and the whole network is fine-tuned to predict target phoneme states. We perform experiments on a Cantonese conversational telephone speech corpus and find that increasing the number of auto-encoders in the network produces more useful features, but requires pre-training, especially when little training data is available. Using more unlabeled data for pre-training only yields additional gains. Evaluations on larger datasets and on different system setups demonstrate the general applicability of our approach. In terms of word error rate, relative improvements of 9.2% (Cantonese, ML training), 9.3% (Tagalog, BMMI-SAT training), 12% (Tagalog, confusion network combinations with MFCCs), and 8.7% (Switchboard) are achieved.
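The training scheme summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, learning rates, masking noise, tied weights, sigmoid units, and the helper names (pretrain_dae, build_bottleneck_network, fine_tune, bottleneck_features) are all assumptions chosen for illustration, and random toy data stands in for acoustic features and phoneme-state labels.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_dae(data, n_hidden, rng, corruption=0.2, lr=0.05, epochs=50):
    """Train one denoising auto-encoder (tied weights, linear decoder,
    squared error) and return its encoder parameters (W, b)."""
    n_vis = data.shape[1]
    W = rng.normal(0.0, 0.1, (n_vis, n_hidden))
    b_hid, b_vis = np.zeros(n_hidden), np.zeros(n_vis)
    for _ in range(epochs):
        # Masking noise: randomly zero a fraction of the inputs.
        corrupted = data * (rng.random(data.shape) > corruption)
        hidden = sigmoid(corrupted @ W + b_hid)
        err = (hidden @ W.T + b_vis) - data      # reconstruct the clean input
        d_hid = (err @ W) * hidden * (1.0 - hidden)
        W -= lr * (corrupted.T @ d_hid + err.T @ hidden) / len(data)
        b_hid -= lr * d_hid.mean(axis=0)
        b_vis -= lr * err.mean(axis=0)
    return W, b_hid

def build_bottleneck_network(unlabeled, hidden_sizes, n_bottleneck, n_extra,
                             n_classes, rng):
    """Pre-train a stack of auto-encoders layer by layer, then append a
    randomly initialized bottleneck layer, one extra hidden layer, and an
    output layer."""
    layers, inputs = [], unlabeled
    for size in hidden_sizes:
        W, b = pretrain_dae(inputs, size, rng)
        layers.append((W, b))
        inputs = sigmoid(inputs @ W + b)         # clean encoding feeds the next DAE
    sizes = [hidden_sizes[-1], n_bottleneck, n_extra, n_classes]
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        layers.append((rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)))
    return layers

def forward(layers, x):
    """Return the activations of every layer; the last layer is a softmax."""
    acts = [x]
    for i, (W, b) in enumerate(layers):
        z = acts[-1] @ W + b
        if i == len(layers) - 1:
            e = np.exp(z - z.max(axis=1, keepdims=True))
            acts.append(e / e.sum(axis=1, keepdims=True))
        else:
            acts.append(sigmoid(z))
    return acts

def fine_tune(layers, x, labels, lr=0.1, epochs=200):
    """Supervised fine-tuning of the whole network with cross-entropy loss."""
    onehot = np.eye(layers[-1][0].shape[1])[labels]
    for _ in range(epochs):
        acts = forward(layers, x)
        delta = (acts[-1] - onehot) / len(x)
        for i in reversed(range(len(layers))):
            W, b = layers[i]
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:
                delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
            layers[i] = (W - lr * grad_W, b - lr * grad_b)
    return layers

def bottleneck_features(layers, x, n_autoencoders):
    """Activations of the bottleneck layer, which follows the auto-encoders."""
    return forward(layers, x)[n_autoencoders + 1]

# Toy data: 200 unlabeled frames, of which 50 carry (random) class labels.
rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(200, 20))
labeled, labels = unlabeled[:50], rng.integers(0, 3, size=50)

layers = build_bottleneck_network(unlabeled, [16, 16], 4, 8, 3, rng)
layers = fine_tune(layers, labeled, labels)
feats = bottleneck_features(layers, labeled, n_autoencoders=2)
print(feats.shape)  # (50, 4)
```

In this sketch only the auto-encoder layers see the unlabeled frames, while the bottleneck, extra hidden layer, and output layer start from random weights and are trained during fine-tuning on the labeled subset. The bottleneck activations would then serve as input features for a separate recognizer, which is outside the scope of this example.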
Pages: 3377 - 3381
Number of pages: 5
Related papers
50 in total
  • [21] Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction
    Masci, Jonathan
    Meier, Ueli
    Ciresan, Dan
    Schmidhuber, Juergen
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2011, PT I, 2011, 6791 : 52 - 59
  • [22] Emotion Detection using Visual Information with Deep Auto-Encoders
    Bairaju, Siva Prasad Raju
    Ari, Sowmya
    Garimella, Rama Murthy
    2019 IEEE 5TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2019,
  • [23] Real-time transient stability assessment using stacked auto-encoders
    Azarbik, Masoud
    Sarlak, Mostafa
    COMPEL-THE INTERNATIONAL JOURNAL FOR COMPUTATION AND MATHEMATICS IN ELECTRICAL AND ELECTRONIC ENGINEERING, 2020, 39 (04) : 971 - 990
  • [24] Identification of cell pathology by Using Stacked Auto-Encoders Combination with Rotation Forest
    Liu, MengLin
    Yan, Xin
    Wang, Lei
    2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2018), 2018, : 261 - 265
  • [25] Unsupervised 3D Motion Summarization Using Stacked Auto-Encoders
    Protopapadakis, Eftychios
    Rallis, Ioannis
    Doulamis, Anastasios
    Doulamis, Nikolaos
    Voulodimos, Athanasios
    APPLIED SCIENCES-BASEL, 2020, 10 (22) : 1 - 20
  • [26] EXTRACTING STRUCTURAL SPECTRAL FEATURES USING WHAT-WHERE AUTO-ENCODERS FOR STATISTICAL PARAMETRIC SPEECH SYNTHESIS
    Hu, Ya-Jun
    Ling, Zhen-Hua
    Dai, Li-Rong
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4915 - 4919
  • [27] Smile Recognition Based on Deep Auto-Encoders
    Liang, Shufen
    Liang, Xiangqun
    Guo, Min
    2015 11TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION (ICNC), 2015, : 176 - 181
  • [28] Fisher Auto-Encoders
    Elkhalil, Khalil
    Hasan, Ali
    Ding, Jie
    Farsiu, Sina
    Tarokh, Vahid
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 352 - 360
  • [29] Ornstein Auto-Encoders
    Choi, Youngwon
    Won, Joong-Ho
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2172 - 2178
  • [30] Transforming Auto-Encoders
    Hinton, Geoffrey E.
    Krizhevsky, Alex
    Wang, Sida D.
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2011, PT I, 2011, 6791 : 44 - 51