Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space

Cited by: 1
Authors
Su, Feng [1]
Xue, Hao [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
National Science Foundation (US);
Keywords
Music mood classification; Multimodal; Graph learning; Locality Preserving Projection; Bag of sentences; EMOTION CLASSIFICATION;
DOI
10.1007/978-3-319-51811-4_13
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Automatic music mood classification is an important and challenging problem in the field of music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to improve classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. Then, we project the paired low-level features of the two modalities into a learned common discriminative latent space, which not only eliminates between-modality heterogeneity but also increases the discriminability of the resulting descriptions. On the basis of this latent representation, we employ a graph-learning-based multimodal classification model for music mood, which takes into account the cross-modality similarity between local audio and lyrics descriptions of music to effectively exploit the correlations between modalities. The resulting per-sentence mood predictions are then aggregated by a simple voting scheme. The effectiveness of the proposed method is demonstrated in experiments on a real dataset comprising more than 3,000 minutes of music and the corresponding lyrics.
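The record does not include an implementation, but the two aggregation steps the abstract describes (graph-based classification over per-sentence descriptions, followed by majority voting per song) can be sketched in a minimal, illustrative form. Note this is an assumption-laden sketch: the function names are invented, and a generic normalized label-propagation scheme stands in for the authors' actual graph-learning model.

```python
import numpy as np
from collections import Counter

def label_propagation(W, labels, n_classes, alpha=0.9, iters=50):
    """Generic graph label propagation on an affinity matrix W.

    W       : (n, n) symmetric nonnegative affinity between sentence-level
              latent descriptions (cross-modal similarities would feed in here)
    labels  : (n,) int array; class index for labeled nodes, -1 for unlabeled
    returns : (n,) predicted class index per node
    """
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetric normalization D^-1/2 W D^-1/2
    Y = np.zeros((len(labels), n_classes))
    mask = labels >= 0
    Y[mask, labels[mask]] = 1.0              # one-hot seeds for labeled nodes
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y  # spread labels along graph edges
    return F.argmax(axis=1)

def majority_vote(sentence_labels):
    """Aggregate per-sentence mood predictions into one song-level label."""
    return Counter(sentence_labels).most_common(1)[0][0]
```

Usage follows the abstract's pipeline: predict a mood class for every sentence-level description with the graph model, then call `majority_vote` over each song's sentence predictions to get the song's mood.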
Pages: 152-163
Page count: 12
Related Papers
50 records in total
  • [1] Multimodal Earth observation data fusion: Graph-based approach in shared latent space
    Arun, P. V.
    Sadeh, R.
    Avneri, A.
    Tubul, Y.
    Camino, C.
    Buddhiraju, K. M.
    Porwal, A.
    Lati, R. N.
    Zarco-Tejada, P. J.
    Peleg, Z.
    Herrmann, I.
    INFORMATION FUSION, 2022, 78 : 20 - 39
  • [2] Graph-based multimodal fusion with metric learning for multimodal classification
    Angelou, Michalis
    Solachidis, Vassilis
    Vretos, Nicholas
    Daras, Petros
    PATTERN RECOGNITION, 2019, 95 : 296 - 307
  • [3] DISCRIMINATIVE GRAPH-BASED DIMENSIONALITY REDUCTION FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Gu, Yanfeng
    Wang, Qingwang
    2016 8TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2016,
  • [4] Graph-based multimodal semi-supervised image classification
    Xie, Wenxuan
    Lu, Zhiwu
    Peng, Yuxin
    Xiao, Jianguo
    NEUROCOMPUTING, 2014, 138 : 167 - 179
  • [5] Supervised classification using graph-based space partitioning
    Yanev, Nicola
    Valev, Ventzeslav
    Krzyzak, Adam
    Ben Suliman, Karima
    PATTERN RECOGNITION LETTERS, 2019, 128 : 122 - 130
  • [6] Discriminative Graph-Based Fusion of HSI and LiDAR Data for Urban Area Classification
    Gu, Yanfeng
    Wang, Qingwang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2017, 14 (06) : 906 - 910
  • [7] A framework for evaluating multimodal music mood classification
    Hu, Xiao
    Choi, Kahyun
    Downie, J. Stephen
    JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2017, 68 (02) : 273 - 285
  • [8] Graph-Based Discriminative Learning for Location Recognition
    Cao, Song
    Snavely, Noah
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 112 (02) : 239 - 254
  • [9] Graph-Based Discriminative Learning for Location Recognition
    Cao, Song
    Snavely, Noah
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 700 - 707