Codebook-Based Audio Feature Representation for Music Information Retrieval

Cited by: 32
Authors
Vaizman, Yonatan [1]
McFee, Brian [2, 3]
Lanckriet, Gert [1]
Affiliations
[1] Univ Calif San Diego, Dept Elect & Comp Engn, La Jolla, CA 92093 USA
[2] Columbia Univ, Ctr Jazz Studies, New York, NY 10027 USA
[3] Columbia Univ, LabROSA, New York, NY 10027 USA
Funding
National Science Foundation (USA)
Keywords
Audio content representations; music information retrieval; music recommendation; sparse coding; vector quantization; SIMILARITY;
DOI
10.1109/TASLP.2014.2337842
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Digital music has become prolific on the web in recent decades. Automated recommendation systems are essential for users to discover music they love and for artists to reach an appropriate audience. When manual annotations and user preference data are lacking (e.g., for new artists), these systems must rely on content-based methods. Besides powerful machine learning tools for classification and retrieval, a key component for successful recommendation is the audio content representation. Good representations should capture informative musical patterns in the audio signal of songs. These representations should be concise, to enable efficient (low storage, easy indexing, fast search) management of huge music repositories, and should also be easy and fast to compute, to enable real-time interaction with a user supplying new songs to the system. Before designing new audio features, we explore the usage of traditional local features, while adding a stage of encoding with a pre-computed codebook and a stage of pooling to get compact vectorial representations. We experiment with different encoding methods, namely the LASSO, vector quantization (VQ) and cosine similarity (CS). We evaluate the representations' quality in two music information retrieval applications: query-by-tag and query-by-example. Our results show that concise representations can be used for successful performance in both applications. We recommend using top-VQ encoding, which consistently performs well in both applications, and requires much less computation time than the LASSO.
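The encode-then-pool pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define top-VQ, so the sketch assumes one plausible reading: each local feature frame activates its `tau` nearest codewords (by Euclidean distance) with value 1, and the per-frame codes are mean-pooled into a single song-level vector. The function names, the parameter `tau`, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def top_vq_encode(frames, codebook, tau=1):
    """Binary-encode each frame by its tau nearest codewords (assumed reading of top-VQ).

    frames:   (n_frames, d) array of local audio features
    codebook: (k, d) array of pre-computed codewords
    returns:  (n_frames, k) array with exactly tau ones per row
    """
    # Squared Euclidean distance between every frame and every codeword.
    dist2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Indices of the tau closest codewords for each frame.
    nearest = np.argsort(dist2, axis=1)[:, :tau]
    codes = np.zeros((frames.shape[0], codebook.shape[0]))
    np.put_along_axis(codes, nearest, 1.0, axis=1)
    return codes

def mean_pool(codes):
    """Average the per-frame codes into one fixed-length song vector."""
    return codes.mean(axis=0)

# Toy data (hypothetical): 3 codewords and 3 frames in a 2-D feature space.
codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
frames = np.array([[1.0, 1.0], [9.0, 9.0], [0.0, 1.0]])

song_vector = mean_pool(top_vq_encode(frames, codebook, tau=1))
song_vector_top2 = mean_pool(top_vq_encode(frames, codebook, tau=2))
```

With `tau=1` the pooled vector is a normalized histogram of nearest-codeword counts. Its length equals the codebook size regardless of song duration, which is what makes such a representation compact and easy to index.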
Pages: 1483-1493
Number of pages: 11