EXTRACTING DEEP BOTTLENECK FEATURES USING STACKED AUTO-ENCODERS

Cited by: 0
Authors
Gehring, Jonas [1 ]
Miao, Yajie [2 ]
Metze, Florian [2 ]
Waibel, Alex [1 ,2 ]
Affiliations
[1] Karlsruhe Inst Technol, Interact Syst Lab, D-76021 Karlsruhe, Germany
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Bottleneck features; Deep learning; Auto-encoders; Networks
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
In this work, a novel training scheme for generating bottleneck features from deep neural networks is proposed. A stack of denoising auto-encoders is first trained in a layer-wise, unsupervised manner. Afterwards, the bottleneck layer and an additional layer are added, and the whole network is fine-tuned to predict target phoneme states. We perform experiments on a Cantonese conversational telephone speech corpus and find that increasing the number of auto-encoders in the network produces more useful features, but requires pre-training, especially when little training data is available. Pre-training on more unlabeled data yields further gains. Evaluations on larger datasets and on different system setups demonstrate the general applicability of our approach. In terms of word error rate, relative improvements of 9.2% (Cantonese, ML training), 9.3% (Tagalog, BMMI-SAT training), 12% (Tagalog, confusion network combinations with MFCCs), and 8.7% (Switchboard) are achieved.
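The training scheme the abstract describes, layer-wise unsupervised pre-training of denoising auto-encoders, insertion of a bottleneck layer plus one additional layer, and supervised fine-tuning of the whole network, can be sketched in plain NumPy. This is an illustrative toy, not the paper's implementation: all layer sizes, noise levels, learning rates, and the random "acoustic features" and phoneme-state targets below are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DenoisingAutoencoder:
    """One auto-encoder layer with tied weights and masking noise."""
    def __init__(self, n_in, n_hidden, noise=0.2, lr=0.1):
        self.W = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b = np.zeros(n_hidden)   # encoder bias
        self.c = np.zeros(n_in)       # decoder bias (weights are tied)
        self.noise, self.lr = noise, lr

    def encode(self, x):
        return sigmoid(x @ self.W + self.b)

    def fit(self, X, epochs=20):
        for _ in range(epochs):
            Xn = X * (rng.random(X.shape) > self.noise)   # corrupt input
            h = self.encode(Xn)
            r = sigmoid(h @ self.W.T + self.c)            # reconstruct
            d_r = (r - X) * r * (1 - r)                   # squared-error grad
            d_h = (d_r @ self.W) * h * (1 - h)
            self.W -= self.lr * (Xn.T @ d_h + d_r.T @ h) / len(X)
            self.b -= self.lr * d_h.mean(axis=0)
            self.c -= self.lr * d_r.mean(axis=0)

# 1) Layer-wise unsupervised pre-training of the auto-encoder stack.
X = rng.random((256, 40))             # stand-in for acoustic feature vectors
dims = [40, 64, 64]                   # toy sizes, not the paper's
stack, inp = [], X
for n_in, n_out in zip(dims[:-1], dims[1:]):
    dae = DenoisingAutoencoder(n_in, n_out)
    dae.fit(inp)
    stack.append(dae)
    inp = dae.encode(inp)             # feed codes to the next layer

# 2) Add a bottleneck layer plus an output layer, then fine-tune the
#    whole network on (here: random) phoneme-state targets.
n_bn, n_classes = 16, 10
Ws = [d.W.copy() for d in stack] + [rng.normal(0.0, 0.1, (dims[-1], n_bn)),
                                    rng.normal(0.0, 0.1, (n_bn, n_classes))]
bs = [d.b.copy() for d in stack] + [np.zeros(n_bn), np.zeros(n_classes)]
Y = np.eye(n_classes)[rng.integers(0, n_classes, len(X))]

for _ in range(50):                   # supervised backprop fine-tuning
    acts = [X]
    for W, b in zip(Ws[:-1], bs[:-1]):
        acts.append(sigmoid(acts[-1] @ W + b))
    logits = acts[-1] @ Ws[-1] + bs[-1]
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    delta = (p - Y) / len(X)          # softmax cross-entropy gradient
    for i in range(len(Ws) - 1, -1, -1):
        gW, gb = acts[i].T @ delta, delta.sum(axis=0)
        if i:                         # propagate before updating layer i
            delta = (delta @ Ws[i].T) * acts[i] * (1 - acts[i])
        Ws[i] -= 0.5 * gW
        bs[i] -= 0.5 * gb

# 3) Bottleneck features = activations of the narrow layer after fine-tuning.
bn_feats = X
for W, b in zip(Ws[:-1], bs[:-1]):
    bn_feats = sigmoid(bn_feats @ W + b)
```

After fine-tuning, the output layer is discarded and the 16-dimensional bottleneck activations serve as features for a downstream recogniser, which is the role the abstract's bottleneck features play in the speech-recognition pipelines evaluated there.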
Pages: 3377-3381 (5 pages)