Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space

Cited by: 1
Authors
Su, Feng [1]
Xue, Hao [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
National Science Foundation (US);
Keywords
Music mood classification; Multimodal; Graph learning; Locality Preserving Projection; Bag of sentences; EMOTION CLASSIFICATION;
DOI
10.1007/978-3-319-51811-4_13
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Automatic music mood classification is an important and challenging problem in the field of music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to improve classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. Then, we project the paired low-level features of the two modalities into a learned common discriminative latent space, which not only eliminates between-modality heterogeneity but also increases the discriminability of the resulting descriptions. On the basis of this latent representation, we employ a graph-learning-based multimodal classification model for music mood, which takes into account the cross-modality similarity between local audio and lyrics descriptions of music to effectively exploit the correlations between modalities. The resulting per-sentence mood predictions are then aggregated by a simple voting scheme. The effectiveness of the proposed method is demonstrated in experiments on a real dataset comprising more than 3,000 minutes of music and the corresponding lyrics.
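The record does not include an implementation, but the two aggregation steps the abstract describes (graph-based classification over per-sentence descriptions, followed by majority voting per song) can be sketched in a minimal, illustrative form. Note this is an assumption-laden sketch: the function names are invented, and a generic normalized label-propagation scheme stands in for the authors' actual graph-learning model.

```python
import numpy as np
from collections import Counter

def label_propagation(W, labels, n_classes, alpha=0.9, iters=50):
    """Generic graph label propagation on an affinity matrix W.

    W       : (n, n) symmetric nonnegative affinity between sentence-level
              latent descriptions (cross-modal similarities would feed in here)
    labels  : (n,) int array; class index for labeled nodes, -1 for unlabeled
    returns : (n,) predicted class index per node
    """
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetric normalization D^-1/2 W D^-1/2
    Y = np.zeros((len(labels), n_classes))
    mask = labels >= 0
    Y[mask, labels[mask]] = 1.0              # one-hot seeds for labeled nodes
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y  # spread labels along graph edges
    return F.argmax(axis=1)

def majority_vote(sentence_labels):
    """Aggregate per-sentence mood predictions into one song-level label."""
    return Counter(sentence_labels).most_common(1)[0][0]
```

Usage follows the abstract's pipeline: predict a mood class for every sentence-level description with the graph model, then call `majority_vote` over each song's sentence predictions to get the song's mood.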
Pages: 152-163
Page count: 12
Related Papers
50 records in total
  • [1] Multimodal Earth observation data fusion: Graph-based approach in shared latent space
    Arun, P. V.
    Sadeh, R.
    Avneri, A.
    Tubul, Y.
    Camino, C.
    Buddhiraju, K. M.
    Porwal, A.
    Lati, R. N.
    Zarco-Tejada, P. J.
    Peleg, Z.
    Herrmann, I.
    INFORMATION FUSION, 2022, 78 : 20 - 39
  • [2] Graph-based multimodal fusion with metric learning for multimodal classification
    Angelou, Michalis
    Solachidis, Vassilis
    Vretos, Nicholas
    Daras, Petros
    PATTERN RECOGNITION, 2019, 95 : 296 - 307
  • [3] DISCRIMINATIVE GRAPH-BASED DIMENSIONALITY REDUCTION FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Gu, Yanfeng
    Wang, Qingwang
    2016 8TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2016,
  • [4] Graph-based multimodal semi-supervised image classification
    Xie, Wenxuan
    Lu, Zhiwu
    Peng, Yuxin
    Xiao, Jianguo
    NEUROCOMPUTING, 2014, 138 : 167 - 179
  • [5] Supervised classification using graph-based space partitioning
    Yanev, Nicola
    Valev, Ventzeslav
    Krzyzak, Adam
    Ben Suliman, Karima
    PATTERN RECOGNITION LETTERS, 2019, 128 : 122 - 130
  • [6] Discriminative Graph-Based Fusion of HSI and LiDAR Data for Urban Area Classification
    Gu, Yanfeng
    Wang, Qingwang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2017, 14 (06) : 906 - 910
  • [7] A framework for evaluating multimodal music mood classification
    Hu, Xiao
    Choi, Kahyun
    Downie, J. Stephen
    JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2017, 68 (02) : 273 - 285
  • [8] Graph-Based Discriminative Learning for Location Recognition
    Cao, Song
    Snavely, Noah
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 112 (02) : 239 - 254
  • [9] Graph-Based Discriminative Learning for Location Recognition
    Cao, Song
    Snavely, Noah
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 700 - 707