Music genre classification based on fusing audio and lyric information

Cited by: 0
Authors
You Li
Zhihai Zhang
Han Ding
Liang Chang
Affiliations
[1] Guilin University of Electronic Technology, Guangxi Key Laboratory of Trusted Software
[2] Guilin University of Electronic Technology, School of Electronic Engineering and Automation
Keywords
Music genre classification; Audio information; Lyric information; Information fusion
DOI
Not available
Abstract
Music genre classification (MGC) has a wide range of application scenarios. Traditional MGC methods consider either audio information or lyric information alone, which leads to unsatisfactory recognition performance. In this paper, we propose a multimodal music genre classification framework that integrates both audio and lyric information. By exploiting the complementarity of the two modalities, music genres can be represented more comprehensively. First, the framework extracts the mel-spectrogram of the audio, and a convolutional neural network extracts audio features from it. In parallel, BERT is used to obtain a distributed representation of the lyrics. The information from the two modalities is then fused through different strategies, such as feature-level and decision-level fusion. To address the serious mismatch between the convergence speeds of the audio channel and the lyric channel, we start training the two channels asynchronously and use different learning rates for them. A series of experiments is carried out to verify the effectiveness of the proposed model. The proposed model achieves an F1 score of 0.87 for music genre classification, approximately 4% higher than the best baseline in the experiments.
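The abstract names feature-level fusion, decision-level fusion, and an asynchronous training start with per-channel learning rates, but gives no implementation details. The following is a minimal, illustrative sketch of those three ideas in plain Python, not the authors' implementation. All function names, the fusion weight, the learning-rate values, the start epoch, and the choice of which channel starts later are assumptions made only for illustration; the CNN and BERT encoders are assumed to have already produced the feature vectors and logits.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feature_level_fusion(audio_features, lyric_features):
    """Feature-level fusion: concatenate the two modality vectors so that
    a single downstream classifier sees both (hypothetical helper)."""
    return list(audio_features) + list(lyric_features)


def decision_level_fusion(audio_logits, lyric_logits, audio_weight=0.5):
    """Decision-level fusion: each modality is classified separately and
    the class probabilities are combined by a weighted average.
    audio_weight is an assumed hyperparameter, not a value from the paper."""
    p_audio = softmax(audio_logits)
    p_lyric = softmax(lyric_logits)
    return [audio_weight * a + (1.0 - audio_weight) * l
            for a, l in zip(p_audio, p_lyric)]


def channel_learning_rates(epoch, audio_lr=1e-3, lyric_lr=2e-5,
                           lyric_start_epoch=3):
    """Asynchronous start with different learning rates per channel.
    A rate of 0.0 means the channel is not being updated yet. Which
    channel starts later, and every number here, is an assumption."""
    lyric = lyric_lr if epoch >= lyric_start_epoch else 0.0
    return {"audio": audio_lr, "lyric": lyric}


# Usage with made-up scores for three hypothetical genres.
audio_logits = [2.0, 0.5, 0.1]
lyric_logits = [0.2, 1.5, 0.3]
fused = decision_level_fusion(audio_logits, lyric_logits, audio_weight=0.6)
predicted = max(range(len(fused)), key=fused.__getitem__)
joint = feature_level_fusion([0.1, 0.2], [0.3])
```

Decision-level fusion keeps the two classifiers independent, so each channel can be trained on its own schedule. Feature-level fusion instead lets one classifier learn interactions between audio and lyric features, at the cost of coupling the two channels during training.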
Pages: 20157-20176
Number of pages: 19