Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14

Authors:

Bian, Wenhao [1,2]

Wang, Jie [2]

Zhuang, Bojin [2]

Yang, Jiankui [1]

Wang, Shaojun [2]

Xiao, Jing [2]

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China

Source:

PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672

Keywords:

Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning

DOI:

10.1007/978-3-030-29894-4_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, deep learning techniques have received intense attention owing to their great success in image recognition. Deep learning is increasingly being adopted across information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, and DenseNet is demonstrated to perform better than the Residual Neural Network (ResNet). Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, stacking-based ensemble learning with an SVM is employed. We believe that the proposed combination of DenseNet's strong representation and data augmentation can be adapted to other audio processing tasks.
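The abstract names time overlapping as one augmentation but does not describe its details. As a rough illustration only, the sketch below assumes it means cutting each recording into fixed-length segments whose start points are closer together than the segment length, so that one track yields several overlapping training examples. The function name `overlapping_segments` and the parameters `segment_len` and `hop_len` are hypothetical and do not come from the paper.

```python
import numpy as np

def overlapping_segments(signal, segment_len, hop_len):
    """Slice a 1-D signal into fixed-length segments spaced hop_len apart.

    Assumption: this is one plausible reading of "time overlapping";
    the paper's actual procedure may differ.
    """
    signal = np.asarray(signal)
    if segment_len <= 0 or hop_len <= 0:
        raise ValueError("segment_len and hop_len must be positive")
    if len(signal) < segment_len:
        # Too short to produce even one full segment.
        return np.empty((0, segment_len), dtype=signal.dtype)
    # Number of full segments that fit when stepping by hop_len.
    n_segments = 1 + (len(signal) - segment_len) // hop_len
    starts = np.arange(n_segments) * hop_len
    return np.stack([signal[s:s + segment_len] for s in starts])

# Toy example: 10 samples, segments of 4 samples, 50% overlap.
segments = overlapping_segments(np.arange(10), segment_len=4, hop_len=2)
print(segments.shape)
```

With `hop_len` smaller than `segment_len`, consecutive segments share samples; setting the two equal gives non-overlapping segments. Pitch shifting, the other augmentation mentioned, is not shown here.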


Pages: 56-65

Page count: 10

