Codebook-Based Audio Feature Representation for Music Information Retrieval

Cited by: 32
Authors
Vaizman, Yonatan [1]
McFee, Brian [2, 3]
Lanckriet, Gert [1]
Affiliations
[1] Univ Calif San Diego, Dept Elect & Comp Engn, La Jolla, CA 92093 USA
[2] Columbia Univ, Ctr Jazz Studies, New York, NY 10027 USA
[3] Columbia Univ, LabROSA, New York, NY 10027 USA
Funding
National Science Foundation (USA);
Keywords
Audio content representations; music information retrieval; music recommendation; sparse coding; vector quantization; SIMILARITY;
DOI
10.1109/TASLP.2014.2337842
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Digital music has become prolific on the web in recent decades. Automated recommendation systems are essential for users to discover music they love and for artists to reach an appropriate audience. When manual annotations and user preference data are lacking (e.g., for new artists), these systems must rely on content-based methods. Besides powerful machine learning tools for classification and retrieval, a key component of successful recommendation is the audio content representation. Good representations should capture informative musical patterns in the audio signal of songs. These representations should be concise, to enable efficient (low storage, easy indexing, fast search) management of huge music repositories, and should also be easy and fast to compute, to enable real-time interaction with a user supplying new songs to the system. Before designing new audio features, we explore the use of traditional local features, adding a stage of encoding with a pre-computed codebook and a stage of pooling to obtain compact vectorial representations. We experiment with different encoding methods, namely the LASSO, vector quantization (VQ), and cosine similarity (CS). We evaluate the representations' quality in two music information retrieval applications: query-by-tag and query-by-example. Our results show that concise representations can be used for successful performance in both applications. We recommend using top-VQ encoding, which consistently performs well in both applications and requires much less computation time than the LASSO.
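The encode-then-pool pipeline named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not define top-VQ, so the sketch assumes it means assigning equal weight to the `tau` nearest codewords of each frame; the function names, the `tau` parameter, the mean-pooling choice, and the random stand-in data are all hypothetical. LASSO encoding is omitted.

```python
import numpy as np

def top_vq_encode(frames, codebook, tau=2):
    """Assumed top-tau VQ: put weight 1/tau on each frame's tau nearest codewords."""
    # Squared Euclidean distance between every frame (n, d) and codeword (k, d).
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :tau]          # indices of tau closest codewords
    codes = np.zeros((frames.shape[0], codebook.shape[0]))
    np.put_along_axis(codes, nearest, 1.0 / tau, axis=1)
    return codes

def cosine_encode(frames, codebook, eps=1e-12):
    """Cosine similarity between each frame and each codeword."""
    f = frames / (np.linalg.norm(frames, axis=1, keepdims=True) + eps)
    c = codebook / (np.linalg.norm(codebook, axis=1, keepdims=True) + eps)
    return f @ c.T

def mean_pool(codes):
    """Pool per-frame codes into one fixed-length vector per song."""
    return codes.mean(axis=0)

# Random stand-in data: 100 frames of 13-dimensional local features, 8 codewords.
rng = np.random.default_rng(0)
frames = rng.normal(size=(100, 13))
codebook = rng.normal(size=(8, 13))

song_vq = mean_pool(top_vq_encode(frames, codebook, tau=2))
song_cs = mean_pool(cosine_encode(frames, codebook))
print(song_vq.shape, song_cs.shape)
```

Under these assumptions, each song is reduced to one vector whose length equals the codebook size, regardless of how many frames the song contains, which is what makes the representation compact and easy to index.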
Pages: 1483-1493
Page count: 11
Related papers
50 items in total
  • [41] Codebook-Based Precoding for SDMA-OFDMA with Spectrum Sharing
    Jo, Han-Shin
    ETRI JOURNAL, 2011, 33 (06) : 831 - 840
  • [42] Blind Bandwidth Extension for Codebook-based Bayesian Speech Enhancement
    Li, Yaxing
    Kim, Jonghyeon
    Kang, Sangwon
    18TH IEEE INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS (ISCE 2014), 2014,
  • [43] Adaptive maintenance scheme for codebook-based dynamic background subtraction
    Zeng, Zhi
    Jia, Jianyuan
    Zhu, Zhaofei
    Yu, Dalin
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2016, 152 : 58 - 66
  • [44] A Fast and Efficient Codebook-Based RIS Phase Configuration Method
    Haskou, Abdullah
    Khaleghi, Hamidreza
    2023 INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING, IWCMC, 2023, : 13 - 17
  • [45] Using a Generic Model for Codebook-based Gait Recognition Algorithms
    Khan, Muhammad Hassan
    Farid, Muhammad Shahid
    Grzegorzek, Marcin
    2018 6TH INTERNATIONAL WORKSHOP ON BIOMETRICS AND FORENSICS (IWBF), 2018,
  • [46] Codebook-Based Solutions for Reconfigurable Intelligent Surfaces and Their Open Challenges
    An, Jiancheng
    Xu, Chao
    Wu, Qingqing
    Ng, Derrick Wing Kwan
    Di Renzo, Marco
    Yuen, Chau
    Hanzo, Lajos
    IEEE WIRELESS COMMUNICATIONS, 2024, 31 (02) : 134 - 141
  • [47] Codebook-based Speech Enhancement with Bayesian LP Parameters Estimation
    Wang, Qing
    Bao, Chang-chun
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 1245 - 1248
  • [48] HMM-based music retrieval using stereophonic feature information and framelength adaptation
    Schuller, B
    Rigoll, G
    Lang, M
    2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL II, PROCEEDINGS, 2003, : 713 - 716
  • [49] Robust music information retrieval on mobile network based on multi-feature clustering
    Yoon, Won-Jung
    Oh, Sanghun
    Park, Kyu-Sik
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2006, 4093 : 279 - 283
  • [50] Entropy Optimized Feature-Based Bag-of-Words Representation for Information Retrieval
    Passalis, Nikolaos
    Tefas, Anastasios
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (07) : 1664 - 1677