Codebook-Based Audio Feature Representation for Music Information Retrieval

Cited by: 32
Authors
Vaizman, Yonatan [1]
McFee, Brian [2, 3]
Lanckriet, Gert [1]
Affiliations
[1] Univ Calif San Diego, Dept Elect & Comp Engn, La Jolla, CA 92093 USA
[2] Columbia Univ, Ctr Jazz Studies, New York, NY 10027 USA
[3] Columbia Univ, LabROSA, New York, NY 10027 USA
Funding
National Science Foundation (USA)
Keywords
Audio content representations; music information retrieval; music recommendation; sparse coding; vector quantization; similarity
DOI
10.1109/TASLP.2014.2337842
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Digital music has become prolific on the web in recent decades. Automated recommendation systems are essential for users to discover music they love and for artists to reach an appropriate audience. When manual annotations and user preference data are lacking (e.g., for new artists), these systems must rely on content-based methods. Besides powerful machine learning tools for classification and retrieval, a key component of successful recommendation is the audio content representation. Good representations should capture informative musical patterns in the audio signal of songs. These representations should be concise, to enable efficient (low storage, easy indexing, fast search) management of huge music repositories, and should also be easy and fast to compute, to enable real-time interaction with a user supplying new songs to the system. Before designing new audio features, we explore the usage of traditional local features, while adding a stage of encoding with a pre-computed codebook and a stage of pooling to get compact vectorial representations. We experiment with different encoding methods, namely the LASSO, vector quantization (VQ) and cosine similarity (CS). We evaluate the representations' quality in two music information retrieval applications: query-by-tag and query-by-example. Our results show that concise representations can be used for successful performance in both applications. We recommend using top-VQ encoding, which consistently performs well in both applications and requires much less computation time than the LASSO.
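The encode-then-pool pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, the codebook and frames are toy values, "top-VQ" is assumed here to mean marking the tau nearest codewords of each frame (tau = 1 reduces to plain VQ), mean pooling is assumed, the cosine similarity encoder is the plain unthresholded form, and the LASSO encoder is omitted.

```python
import math

def sq_dist(x, c):
    """Squared Euclidean distance between a frame and a codeword."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def encode_top_vq(frame, codebook, tau=1):
    """Binary code with ones at the tau nearest codewords (assumed reading of top-VQ)."""
    order = sorted(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))
    code = [0.0] * len(codebook)
    for k in order[:tau]:
        code[k] = 1.0
    return code

def encode_cosine(frame, codebook):
    """Cosine similarity between the frame and each codeword (0.0 for zero vectors)."""
    nf = math.sqrt(sum(a * a for a in frame))
    code = []
    for c in codebook:
        nc = math.sqrt(sum(b * b for b in c))
        dot = sum(a * b for a, b in zip(frame, c))
        code.append(dot / (nf * nc) if nf > 0 and nc > 0 else 0.0)
    return code

def mean_pool(codes):
    """Average the per-frame codes into one fixed-length song vector."""
    n = len(codes)
    return [sum(col) / n for col in zip(*codes)]

def represent(frames, codebook, encoder):
    """Encode every local feature frame, then pool over time."""
    return mean_pool([encoder(f, codebook) for f in frames])

# Toy data: a 3-word codebook and 4 two-dimensional feature frames.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 0.1], [0.8, 0.0], [0.1, 0.9]]

song_vq = represent(frames, codebook, encode_top_vq)
song_top2 = represent(frames, codebook, lambda f, cb: encode_top_vq(f, cb, tau=2))
song_cs = represent(frames, codebook, encode_cosine)

print(song_vq)    # [0.25, 0.5, 0.25]
print(song_top2)  # [1.0, 0.75, 0.25]
```

Whatever the number of frames, each song maps to a vector whose length equals the codebook size, which is what keeps the representation compact and easy to index.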
Pages: 1483-1493
Number of pages: 11