Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition

Cited by: 5
Authors
Li, Xingfeng [1]
Shi, Xiaohan [2]
Hu, Desheng [3]
Li, Yongwei [4]
Zhang, Qingchen [1]
Wang, Zhengxia [5]
Unoki, Masashi [6]
Akagi, Masato [6]
Affiliations
[1] Hainan Univ, Grad Sch Comp Sci & Technol, Haikou 570288, Peoples R China
[2] Nagoya Univ, Sch Informat Sci, Nagoya 4648601, Japan
[3] Taiyuan Univ Technol, Coll Informat & Comp, Taiyuan 030024, Peoples R China
[4] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
[5] Hainan Univ, Sch Comp Sci & Technol, Haikou 570288, Peoples R China
[6] Japan Adv Inst Sci & Technol, Sch Informat Sci, Nomi 9231292, Japan
Funding
National Natural Science Foundation of China
Keywords
Affective computing; speech emotion recognition; acoustic representation; music theory and speech analysis; PERCEPTION; EXPRESSION; PATTERNS; FEATURES; PITCH; PERSPECTIVE; MODALITIES; KNOWLEDGE; INTERVALS; COGNITION;
DOI
10.1109/TASLP.2023.3289312
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This research presents a music theory-inspired acoustic representation (hereafter, MTAR) to address improved speech emotion recognition. The recognition of emotion in speech and music is developed in parallel, yet a relatively limited understanding of MTAR for interpreting speech emotions is involved. In the present study, we use music theory to study representative acoustics associated with emotion in speech from vocal emotion expressions and auditory emotion perception domains. In experiments assessing the role and effectiveness of the proposed representation in classifying discrete emotion categories and predicting continuous emotion dimensions, it shows promising performance compared with extensively used features for emotion recognition based on the spectrogram, Mel-spectrogram, Mel-frequency cepstral coefficients, VGGish, and the large baseline feature sets of the INTERSPEECH challenges. This proposal opens up a novel research avenue in developing a computational acoustic representation of speech emotion via music theory.
Pages: 2534 - 2547
Number of pages: 14
Related papers
50 in total
  • [1] BIOLOGICALLY INSPIRED SPEECH EMOTION RECOGNITION
    Lotjidereshgi, Reza
    Gournay, Philippe
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5135 - 5139
  • [2] Quantum Theory-Inspired Search
    Aerts, Diederik
    Bruza, Peter
    Hou, Yuexian
    Jose, Joemon
    Melucci, Massimo
    Nie, Jian-Yun
    Song, Dawei
    PROCEEDINGS OF THE 2ND EUROPEAN FUTURE TECHNOLOGIES CONFERENCE AND EXHIBITION 2011 (FET 11), 2011, 7 : 278 - 280
  • [3] Representation Learning for Speech Emotion Recognition
    Ghosh, Sayan
    Laksana, Eugene
    Morency, Louis-Philippe
    Scherer, Stefan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3603 - 3607
  • [4] Feature representation for speech emotion Recognition
    Abdollahpour, Mehdi
    Zamani, Lafar
    Rad, Hamidreza Saligheh
    2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017, : 1465 - 1468
  • [5] Biologically inspired emotion recognition from speech
    Caponetti, Laura
    Buscicchio, Cosimo Alessandro
    Castellano, Giovanna
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2011,
  • [6] Biologically inspired emotion recognition from speech
    Laura Caponetti
    Cosimo Alessandro Buscicchio
    Giovanna Castellano
    EURASIP Journal on Advances in Signal Processing, 2011
  • [7] The time course of emotion recognition in speech and music
    Nordstrom, Henrik
    Laukka, Petri
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2019, 145 (05): : 3058 - 3074
  • [8] Speech Emotion Recognition Based on Sparse Representation
    Yan, Jingjie
    Wang, Xiaolan
    Gu, Weiyi
    Ma, Lili
    ARCHIVES OF ACOUSTICS, 2013, 38 (04) : 465 - 470
  • [9] Speech recognition inspired features for acoustic emission
    University of Augsburg, Augsburg, Germany
    eJ. Nondestruct. Test., 2024, 10
  • [10] Acoustic-Prosodic Recognition of Emotion in Speech
    Montenegro, Chuchi S.
    Maravillas, Elmer A.
    2015 INTERNATIONAL CONFERENCE ON HUMANOID, NANOTECHNOLOGY, INFORMATION TECHNOLOGY,COMMUNICATION AND CONTROL, ENVIRONMENT AND MANAGEMENT (HNICEM), 2015, : 527 - +