Toward Multi-modal Music Emotion Classification

Cited by: 0
Authors
Yang, Yi-Hsuan [1 ]
Lin, Yu-Ching [1 ]
Cheng, Heng-Tze [1 ]
Liao, I-Bin [2 ]
Ho, Yeh-Chin [2 ]
Chen, Homer H. [1 ]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] Chunghwa Telecom, Telecommun Labs, Taipei, Taiwan
Keywords
Music emotion recognition; multi-modal fusion; lyrics; natural language processing; probabilistic latent semantic analysis
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Categorical music emotion classification, which divides emotion into discrete classes, has reached a performance limit when audio features alone are used, owing to the semantic gap between the object feature level and the human cognitive level of emotion perception. Motivated by the fact that lyrics carry rich semantic information about a song, we propose a multi-modal approach to improve categorical music emotion classification. By exploiting both the audio features and the lyrics of a song, the proposed approach improves 4-class emotion classification accuracy from 46.6% to 57.1%. The results also show that incorporating lyrics significantly enhances the classification accuracy of valence.
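To make the fusion idea concrete, below is a minimal sketch of audio-plus-lyrics emotion classification. It is not the authors' implementation: it uses scikit-learn, substitutes LSA (TF-IDF followed by TruncatedSVD) for the probabilistic latent semantic analysis named in the keywords, fuses the modalities by simple feature concatenation, and runs on synthetic placeholder data; all names, features, and numbers are illustrative assumptions.

# Minimal sketch: multi-modal (audio + lyrics) 4-class emotion classification.
# NOT the paper's method: scikit-learn, LSA as a stand-in for PLSA, and
# early fusion by concatenation; the data below is a synthetic placeholder.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical corpus: per-song audio descriptors and lyric text, each song
# labeled with one of 4 emotion classes (the valence-arousal quadrants).
n_songs = 200
audio = rng.normal(size=(n_songs, 40))  # placeholder audio features
vocab = ["love", "sad", "happy", "tears", "dance", "night", "alone", "sun"]
lyrics = [" ".join(rng.choice(vocab, size=30)) for _ in range(n_songs)]
labels = rng.integers(0, 4, size=n_songs)

# Text modality: TF-IDF weighting, then a low-dimensional latent-topic
# representation of the lyrics (LSA here; the paper's keywords name PLSA).
lyric_feats = make_pipeline(
    TfidfVectorizer(), TruncatedSVD(n_components=4, random_state=0)
).fit_transform(lyrics)

# Early fusion: concatenate standardized audio features with lyric features.
audio_std = StandardScaler().fit_transform(audio)
fused = np.hstack([audio_std, lyric_feats])

# Compare the audio-only baseline against the fused representation.
print("audio-only accuracy:", cross_val_score(SVC(), audio_std, labels, cv=5).mean())
print("audio+lyrics accuracy:", cross_val_score(SVC(), fused, labels, cv=5).mean())

A late-fusion variant, training one classifier per modality and combining their outputs, is another common design; the gain reported in the abstract (46.6% to 57.1%) comes from the paper's own scheme, not from this sketch.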
Pages: 70 / +
Number of pages: 3