Multi-Instrumental Deep Learning for Automatic Genre Recognition

Times cited: 0
Authors
Klec, Mariusz [1]
Affiliations
[1] Polish Japanese Acad Informat Technol, Multimedia Dept, Warsaw, Poland
Keywords
RBM; Deep neural network; Automatic genre recognition; Unsupervised Pre-training; Neural networks; Music information retrieval;
DOI
10.1007/978-3-319-31277-4_5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The experiments described in this paper utilize songs in the MIDI format to train Deep Neural Networks (DNNs) for the Automatic Genre Recognition (AGR) problem. The MIDI songs were decomposed into separate instrument groups and converted to audio. Restricted Boltzmann Machines (RBMs) were trained with the individual groups of instruments as a method of pre-training of the final DNN models. The Scattering Wavelet Transform (SWT) was used for signal representation. The paper explains the basics of RBMs and the SWT, followed by a review of DNN pre-training methods that use separate instrument audio. Experiments show that this approach allows building better discriminating models than those that were trained using whole songs.
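The pre-training idea in the abstract can be illustrated with a minimal sketch: one RBM is trained per instrument group, and the learned weights then initialize part of a network's first layer. Everything specific here is an assumption for illustration only: the group names (`drums`, `bass`, `melody`), the random arrays standing in for SWT features, the layer sizes, the use of one-step contrastive divergence (CD-1), and the way the per-group RBMs are concatenated. The abstract does not state the paper's actual architecture or hyperparameters.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden, epochs=20, lr=0.05, seed=0):
    """Train a Bernoulli RBM with one-step contrastive divergence (CD-1)."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = data.shape
    W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden activations driven by the data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_samples
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid


# Hypothetical stand-in for per-group SWT features scaled to [0, 1]:
# 64 frames x 16 coefficients for each assumed instrument group.
data_rng = np.random.default_rng(1)
groups = {name: data_rng.random((64, 16)) for name in ("drums", "bass", "melody")}

# One RBM per instrument group (unsupervised pre-training).
pretrained = {name: train_rbm(x, n_hidden=8) for name, x in groups.items()}


def first_layer(features_by_group):
    """Assumed combination: concatenate each group's RBM hidden activations."""
    parts = [
        sigmoid(features_by_group[name] @ W + b)
        for name, (W, b) in pretrained.items()
    ]
    return np.concatenate(parts, axis=1)


hidden = first_layer(groups)
print(hidden.shape)
```

In a full system, a supervised classifier would be stacked on top of `hidden` and the whole network fine-tuned on genre labels; that step is omitted here.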
Pages: 53-61
Number of pages: 9