Polyphonic Piano Transcription with a Note-Based Music Language Model

Cited by: 6
Authors
Wang, Qi [1 ,2 ]
Zhou, Ruohua [1 ,2 ]
Yan, Yonghong [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[3] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830001, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2018 / Vol. 8 / Issue 03
Funding
National Natural Science Foundation of China;
Keywords
polyphonic piano transcription; note-based music language model; recurrent neural network; restricted Boltzmann machine;
DOI
10.3390/app8030470
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
This paper proposes a note-based music language model (MLM) for improving note-level polyphonic piano transcription. The MLM is based on the recurrent structure, which could model the temporal correlations between notes in music sequences. To combine the outputs of the note-based MLM and acoustic model directly, an integrated architecture is adopted in this paper. We also propose an inference algorithm, in which the note-based MLM is used to predict notes at the blank onsets in the thresholding transcription results. The experimental results show that the proposed inference algorithm improves the performance of note-level transcription. We also observe that the combination of the restricted Boltzmann machine (RBM) and recurrent structure outperforms a single recurrent neural network (RNN) or long short-term memory network (LSTM) in modeling the high-dimensional note sequences. Among all the MLMs, LSTM-RBM helps the system yield the best results on all evaluation metrics regardless of the performance of acoustic models.
Pages: 15
Related papers
50 items in total
  • [31] SIC RECEIVER FOR POLYPHONIC PIANO MUSIC
    Barbancho, Ana M.
    Barbancho, Isabel
    Soto, Beatriz
    Tardon, Lorenzo J.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 377 - 380
  • [32] PIANO MUSIC TRANSCRIPTION MODELING NOTE TEMPORAL EVOLUTION
    Cogliati, Andrea
    Duan, Zhiyao
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 429 - 433
  • [33] Improving generalization for classification-based polyphonic piano transcription
    Poliner, Graham E.
    Ellis, Daniel P. W.
    2007 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, 2007, : 309 - 312
  • [34] Evaluation of the Convolutional NMF for Supervised Polyphonic Music Transcription and Note Isolation
    Gorlow, Stanislaw
    Janer, Jordi
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION, LVA/ICA 2015, 2015, 9237 : 437 - 445
  • [35] Automatic polyphonic piano music transcription by a multi-classification discriminative-learning
    D'Urso, S
    Uncini, A
    NEURAL NETS, 2003, 2859 : 129 - 138
  • [36] POLYPHONIC PIANO NOTE TRANSCRIPTION WITH NON-NEGATIVE MATRIX FACTORIZATION OF DIFFERENTIAL SPECTROGRAM
    Gao, Lufei
    Su, Li
    Yang, Yi-Hsuan
    Lee, Tan
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 291 - 295
  • [37] Note identification in polyphonic music
    Wagstaff, Julian
    Acoustics Bulletin, 24 (04):
  • [38] Unsupervised note activity detection in NMF-based automatic transcription of piano music
    Tavares, Tiago Fernandes
    Arnal Barbedo, Jayme Garcia
    Attux, Romis
    JOURNAL OF NEW MUSIC RESEARCH, 2016, 45 (02) : 118 - 123
  • [39] POLYPHONIC MUSIC TRANSCRIPTION WITH SEMANTIC SEGMENTATION
    Wu, Yu-Te
    Chen, Berlin
    Su, Li
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 166 - 170
  • [40] Automatic Transcription of Polyphonic Vocal Music
    McLeod, Andrew
    Schramm, Rodrigo
    Steedman, Mark
    Benetos, Emmanouil
    APPLIED SCIENCES-BASEL, 2017, 7 (12):