PHONETICS EMBEDDING LEARNING WITH SIDE INFORMATION

Cited by: 0
Authors
Synnaeve, Gabriel [1]
Schatz, Thomas [1, 2]
Dupoux, Emmanuel [1]
Affiliations
[1] CNRS, EHESS, IEC ENS, LSCP, Paris, France
[2] CNRS, ENS, SIERRA Project Team INRIA, Paris, France
Keywords
speech; ABX; deep neural network; side information; semi-supervised; speech embeddings; acoustic model; DISCOVERY;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We show that it is possible to learn an efficient acoustic model using only a small amount of easily available word-level similarity annotations. In contrast to the detailed phonetic labeling required by classical speech recognition technologies, the only information our method requires is pairs of speech excerpts known to be similar (same word) and pairs of speech excerpts known to be different (different words). An acoustic model is obtained by training shallow and deep neural networks, using an architecture and a cost function well adapted to the nature of the provided information. The resulting model is evaluated in an ABX minimal-pair discrimination task and is shown to perform much better (11.8% ABX error rate) than raw speech features (19.6%), and not far from a fully supervised baseline (best neural network: 9.2%, HMM-GMM: 11%).
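To make the ABX error rate quoted in the abstract concrete, the following is a minimal sketch of how such a score could be computed. It rests on simplifying assumptions that are not taken from the paper: each speech excerpt is represented by a single fixed-length embedding vector, dissimilarity is measured by cosine distance, and ties count as half an error. The names `cosine_distance` and `abx_error_rate` are hypothetical, and the published evaluation may differ in all of these respects.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def abx_error_rate(cat_a, cat_b, distance=cosine_distance):
    """Fraction of ABX trials answered incorrectly.

    cat_a and cat_b are lists of embeddings for two categories.
    In each trial, X and A are distinct items from one category and
    B comes from the other. The trial is an error when X is farther
    from A than from B; a tie counts as half an error (assumption).
    Both directions (X drawn from cat_a, then from cat_b) are scored.
    """
    errors = 0.0
    trials = 0
    for same, other in ((cat_a, cat_b), (cat_b, cat_a)):
        for i, x in enumerate(same):
            for j, a in enumerate(same):
                if i == j:
                    continue  # X and A must be different items
                for b in other:
                    d_same = distance(a, x)
                    d_diff = distance(b, x)
                    if d_same > d_diff:
                        errors += 1.0
                    elif d_same == d_diff:
                        errors += 0.5
                    trials += 1
    return errors / trials


# Toy data: two well-separated categories in a 2-D embedding space.
well_separated_a = [[1.0, 0.0], [0.9, 0.1]]
well_separated_b = [[0.0, 1.0], [0.1, 0.9]]
score = abx_error_rate(well_separated_a, well_separated_b)
```

Under these assumptions, a lower score means the representation keeps items of the same category closer together than items of different categories, which is the sense in which the abstract compares learned embeddings against raw speech features.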
Pages: 106-111
Number of pages: 6
Related papers
50 in total
  • [31] Learning Discriminative Recommendation Systems with Side Information
    Zhao, Feipeng
    Guo, Yuhong
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3469 - 3475
  • [32] Partial label learning with noisy side information
    Shaokai Wang
    Mingxuan Xia
    Zilong Wang
    Gengyu Lyu
    Songhe Feng
    Applied Intelligence, 2022, 52 : 12382 - 12396
  • [33] The phonetics of information structure in Yoloxochitl Mixtec
    DiCanio, Christian
    Benn, Joshua
    Castillo Garcia, Rey
    JOURNAL OF PHONETICS, 2018, 68 : 50 - 68
  • [34] Side information embedding scheme for PTS based PAPR reduction in OFDM systems
    Goel, Ashish
    Gupta, Saruti
    ALEXANDRIA ENGINEERING JOURNAL, 2022, 61 (12) : 11765 - 11777
  • [35] Representation Learning for Heterogeneous Information Networks via Embedding Events
    Fu, Guoji
    Yuan, Bo
    Duan, Qiqi
    Yao, Xin
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I, 2019, 11953 : 327 - 339
  • [36] Structure information learning for neutral links in signed network embedding
    Cai, Shensheng
    Shan, Wei
    Zhang, Mingli
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (03)
  • [37] Graph Embedding Learning for Cross-Modal Information Retrieval
    Zhang, Youcai
    Gu, Xiaodong
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 594 - 601
  • [38] More Information Supervised Probabilistic Deep Face Embedding Learning
    Huang, Ying
    Qiu, Shangfeng
    Zhang, Wenwei
    Luo, Xianghui
    Wang, Jinzhuo
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [39] Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
    Shi, Yu
    Zhu, Qi
    Guo, Fang
    Zhang, Chao
    Han, Jiawei
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 2190 - 2199
  • [40] Global Information Embedding Network for Few-Shot Learning
    Feng, Rui
    Ji, Hongbing
    Zhu, Zhigang
    Wang, Lei
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 501 - 505