Music style classification by jointly using CNN and Transformer

Cited by: 0
Authors
Tang, Rui [1]
Qi, Miao [1]
Wang, Qingnan [1]
Affiliations
[1] Northeast Normal Univ, Coll Informat Sci & Technol, Changchun 130117, Peoples R China
Keywords
Music style; Audio classification; CNN; Transformer;
DOI
10.1145/3651671.3651696
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Music influences people in many ways and plays an important role in human life, from emotional expression to social interaction to cognitive development. However, musical styles are varied and often difficult to distinguish. Existing methods typically treat music audio as a time-ordered sequence of features and classify it with an RNN or LSTM. In contrast, this paper proposes a novel music style classification method that transforms music audio into an audio image. A Convolutional Neural Network (CNN) and a Transformer are then combined to jointly extract rich audio image features for music style classification. The effectiveness of the proposed method is evaluated through extensive ablation and comparative experiments. The experimental results demonstrate that the proposed method achieves satisfactory classification accuracy on the GTZAN dataset and outperforms some existing classification methods.
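The abstract does not say which kind of audio image the authors use. As a rough illustration only, the sketch below assumes a log-magnitude short-time Fourier spectrogram, one common way to turn a waveform into a 2-D array that a CNN or Transformer could consume. The function name `log_spectrogram`, the frame length, the hop size, and the synthetic test tone are all hypothetical choices and are not taken from the paper.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a (freq_bins, frames) log-magnitude spectrogram of a 1-D signal.

    Assumption: this stands in for the paper's "audio image"; the actual
    representation used by the authors is not stated in the abstract.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Magnitude of the real FFT of each frame: (n_frames, frame_len // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Log compression, then transpose so rows are frequency and columns are time.
    return np.log1p(magnitude).T


# Synthetic one-second 1 kHz tone sampled at 8 kHz (illustrative input only).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
image = log_spectrogram(tone)
```

The resulting array can be treated like a single-channel image, with frequency on the vertical axis and time on the horizontal axis, which is the property that makes image-style feature extractors applicable to audio.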
Pages: 707-712
Number of pages: 6