DERIVING SPECTRO-TEMPORAL PROPERTIES OF HEARING FROM SPEECH DATA

Cited by: 0
Authors
Ondel, Lucas [1 ,3 ]
Li, Ruizhi [1 ]
Sell, Gregory [1 ,2 ]
Hermansky, Hynek [1 ,2 ,3 ]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD USA
[3] Brno Univ Technol, FIT, Ctr Excellence IT4I, Brno, Czech Republic
Funding
National Science Foundation (USA);
Keywords
perception; spectro-temporal; auditory; deep learning; RECOGNITION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Human hearing and human speech are intrinsically tied together, as the properties of speech almost certainly developed in order to be heard by human ears. As a result of this connection, it has been shown that certain properties of human hearing are mimicked within data-driven systems that are trained to understand human speech. In this paper, we further explore this phenomenon by measuring the spectro-temporal responses of data-derived filters in a front-end convolutional layer of a deep network trained to classify the phonemes of clean speech. The analyses show that the filters do indeed exhibit spectro-temporal responses similar to those measured in mammals, and also that the filters exhibit an additional level of frequency selectivity, similar to the processing pipeline assumed within the Articulation Index.
Pages: 411-415
Page count: 5