Music Auto-tagging Algorithm Based on Deep Analysis on Labels

Authors
Wang Z. [1]
Zhang R. [1]
Gao Y. [1]
Xiao Y. [1]
Affiliation
[1] School of Software Engineering, South China University of Technology, Guangzhou, 510006, Guangdong
Keywords
Deep neural network; Music auto-tagging; Music label vector
DOI
10.12141/j.issn.1000-565X.180273
Abstract
Deep neural network algorithms have made breakthroughs in automatic labeling tasks, but they still struggle with the noisy data found in real-world music datasets. A music auto-tagging algorithm based on deep analysis on labels (DAL) is proposed, which captures the latent relationship between audio features and music tags. The algorithm first extracts audio features through a multi-level convolutional network, and then learns a vector representation of the music tags to reduce the adverse effects of noisy data. The experimental results show that the proposed algorithm achieves a higher mean area under the receiver operating characteristic curve (AUROCC) than other auto-tagging methods. © 2019, Editorial Department, Journal of South China University of Technology. All rights reserved.
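The abstract names mean AUROCC as the evaluation metric but does not say how it is computed. A minimal sketch of one common reading follows, assuming the mean is an unweighted average of per-tag AUROC values over a clips-by-tags score matrix. The function names `auroc` and `mean_auroc` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC for a single tag, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive clip is
    scored above a randomly chosen negative clip (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative clip")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(score_matrix: List[List[float]],
               label_matrix: List[List[int]]) -> float:
    """Unweighted mean of per-tag AUROC; rows are clips, columns are tags.

    Assumption: every tag contributes equally to the mean.
    """
    num_tags = len(score_matrix[0])
    per_tag = []
    for t in range(num_tags):
        tag_scores = [row[t] for row in score_matrix]
        tag_labels = [row[t] for row in label_matrix]
        per_tag.append(auroc(tag_scores, tag_labels))
    return sum(per_tag) / num_tags


# Toy example: 4 clips, 2 tags (illustrative values only).
toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
toy_labels = [[1, 0], [1, 1], [0, 0], [0, 1]]
toy_result = mean_auroc(toy_scores, toy_labels)
```

The pairwise form is quadratic in the number of clips, which is acceptable for illustration; a rank-based computation would be preferable on a full dataset.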
Pages: 71-76
Number of pages: 5