Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14
Authors
Bian, Wenhao [1, 2]
Wang, Jie [2 ]
Zhuang, Bojin [2 ]
Yang, Jiankui [1 ]
Wang, Shaojun [2 ]
Xiao, Jing [2 ]
Affiliations
[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China
Keywords
Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning;
DOI
10.1007/978-3-030-29894-4_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, deep learning has received intense attention owing to its great success in image recognition. It is increasingly being adopted in other information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study of music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, an architecture that has been demonstrated to perform better than the Residual Neural Network (ResNet). Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, stacking-based ensemble learning with an SVM is employed. We believe that the proposed combination of DenseNet's strong representations and data augmentation can be adapted to other audio processing tasks.
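The abstract does not specify how time overlapping is implemented. As a rough illustration only, the sketch below assumes it means cutting each clip into fixed-length windows that share a fraction of their samples with the previous window, so that one labelled clip yields several training examples. The function name `overlapping_segments` and the parameters `segment_len` and `overlap` are hypothetical and do not come from the paper.

```python
import numpy as np


def overlapping_segments(signal, segment_len, overlap):
    """Split a 1-D signal into fixed-length windows that overlap.

    Assumption (not from the paper): `overlap` is the fraction of each
    window shared with the previous one, in the range [0, 1).
    """
    signal = np.asarray(signal)
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Hop size between window starts; at least one sample.
    hop = max(1, int(round(segment_len * (1.0 - overlap))))
    if len(signal) < segment_len:
        # Too short to produce a single full window.
        return np.empty((0, segment_len), dtype=signal.dtype)
    starts = range(0, len(signal) - segment_len + 1, hop)
    return np.stack([signal[s:s + segment_len] for s in starts])


# Toy stand-in for an audio clip: ten samples with values 0..9.
audio = np.arange(10, dtype=np.float32)
segments = overlapping_segments(audio, segment_len=4, overlap=0.5)
no_overlap = overlapping_segments(audio, segment_len=4, overlap=0.0)
```

With 50% overlap the toy clip produces twice as many windows as the non-overlapping split, which is the sense in which this kind of slicing can enlarge a small labelled set. Each window would inherit the label of its source clip.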
Pages: 56-65
Page count: 10