A Music Cognition-Guided Framework for Multi-pitch Estimation

Cited by: 2
Authors
Li, Xiaoquan [1]
Yan, Yijun [2]
Soraghan, John [1]
Wang, Zheng [3]
Ren, Jinchang [2]
Affiliations
[1] Univ Strathclyde, Dept Elect & Elect Engn, Glasgow, Lanark, Scotland
[2] Robert Gordon Univ, Natl Subsea Ctr, Aberdeen AB21 0BH, Scotland
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
Keywords
Music cognition; Automatic music transcription; Multi-pitch estimation; Harmonic structure detection (HSD); Polyphonic music detection; TRANSCRIPTION; NETWORK;
DOI
10.1007/s12559-022-10031-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As one of the most important subtasks of automatic music transcription (AMT), multi-pitch estimation (MPE) has been studied extensively over the past decade for predicting the fundamental frequencies in the frames of audio recordings. However, how to use music perception and cognition for MPE has not yet been thoroughly investigated. Motivated by this, this paper demonstrates how to effectively detect the fundamental frequency and the harmonic structure of polyphonic music using a cognitive framework. Inspired by cognitive neuroscience, an integration of the constant Q transform and a state-of-the-art matrix factorization method called shift-invariant probabilistic latent component analysis (SI-PLCA) is proposed to resolve the polyphonic short-time magnitude log-spectra for multiple pitch estimation and source-specific feature extraction. The cognition of rhythm, harmonic periodicity and instrument timbre is used to guide the characterization of contiguous notes and of the relationship between the fundamental frequency and its harmonic frequencies, in order to detect the pitches from the outputs of SI-PLCA. In the experiments, we compare the performance of the proposed MPE system with a number of existing state-of-the-art approaches (seven weak learning methods and four deep learning methods) on three widely used datasets (i.e. MAPS, BACH10 and TRIOS) in terms of F-measure (F1) values. The experimental results show that the proposed MPE method provides the best overall performance among the compared methods.
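The abstract reports results as F-measure values. As a rough illustration only, frame-level precision, recall and F-measure for multi-pitch estimation can be sketched as below. The representation (one set of MIDI note numbers per frame), the exact-match criterion, and the name `frame_level_f1` are assumptions made for this sketch; the abstract does not state the paper's evaluation protocol, such as the frame hop or the pitch tolerance.

```python
from typing import List, Set, Tuple


def frame_level_f1(
    reference: List[Set[int]], estimate: List[Set[int]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) pooled over all frames.

    Each list element is the set of active pitches in one frame.
    Pitches are hypothetical MIDI note numbers; a pitch counts as
    correct only if it matches a reference pitch exactly.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same number of frames")

    true_pos = false_pos = false_neg = 0
    for ref, est in zip(reference, estimate):
        true_pos += len(ref & est)   # pitches present in both
        false_pos += len(est - ref)  # estimated but not in the reference
        false_neg += len(ref - est)  # in the reference but missed

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data, not taken from MAPS, BACH10 or TRIOS.
reference_frames = [{60, 64, 67}, {60}, set()]
estimated_frames = [{60, 64}, {60, 62}, {65}]
precision, recall, f_measure = frame_level_f1(reference_frames, estimated_frames)
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

Counts are pooled over all frames before the ratios are taken, so frames with many active notes weigh more than sparse ones; averaging per-frame scores instead would be a different, equally plausible convention.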
Pages: 23-35
Number of pages: 13
Related papers
50 in total
  • [31] Multi-Pitch Estimation using NHF with Multi-Dictionary Distinguishing Attack and Reverberation of Sounds
    Fujisawa, Takanori
    Harada, Sora
    Ikehara, Masaaki
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1836 - 1841
  • [32] MULTI-PITCH ESTIMATION OF AUDIO RECORDINGS USING A CODEBOOK-BASED APPROACH
    Hansen, Martin Weiss
    Jensen, Jesper Rindom
    Christensen, Mads Graesboll
    2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 983 - 987
  • [33] JOINT DOA AND MULTI-PITCH ESTIMATION VIA BLOCK SPARSE DICTIONARY LEARNING
    Kronvall, Ted
    Adalbjornsson, Stefan Ingi
    Jakobsson, Andreas
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1053 - 1057
  • [34] TOWARDS COMPLETE POLYPHONIC MUSIC TRANSCRIPTION: INTEGRATING MULTI-PITCH DETECTION AND RHYTHM QUANTIZATION
    Nakamura, Eita
    Benetos, Emmanouil
    Yoshii, Kazuyoshi
    Dixon, Simon
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 101 - 105
  • [35] An efficient approach combined with harmonic and shift invariance for piano music multi-pitch detection
    Deng, Kai
    Liu, Gang
    Huang, Yuzhi
    FOURTH INTERNATIONAL WORKSHOP ON PATTERN RECOGNITION, 2019, 11198
  • [36] Multi-pitch Estimation Based on Sparse Representation with Pre-screened Dictionary
    Gao, Lufei
    Lee, Tan
    2015 IEEE 17TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2015,
  • [37] Deep Neural Network for Multi-Pitch Estimation Using Weighted Cross Entropy Loss
    Stone, Samuel
    Spector, Evan
    2021 IEEE WESTERN NEW YORK IMAGE AND SIGNAL PROCESSING WORKSHOP (WNYISPW), 2021,
  • [38] Multi-pitch Streaming of Harmonic Sound Mixtures
    Duan, Zhiyao
    Han, Jinyu
    Pardo, Bryan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (01) : 138 - 150
  • [39] A Novel Cognition-Guided Neurofeedback BCI Dataset on Nicotine Addiction
    Bu, Junjie
    Liu, Chang
    Gou, Huixing
    Gan, Hefan
    Cheng, Yan
    Liu, Mengyuan
    Ni, Rui
    Liang, Zhen
    Cui, Guanbao
    Zeng, Ginger Qinghong
    Zhang, Xiaochu
    FRONTIERS IN NEUROSCIENCE, 2021, 15
  • [40] A multi-pitch tracking algorithm for noisy speech
    Wu, MY
    Wang, DL
    Brown, GJ
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 369 - 372