Polyphonic Piano Transcription with a Note-Based Music Language Model

Cited by: 6
Authors
Wang, Qi [1, 2]
Zhou, Ruohua [1, 2]
Yan, Yonghong [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[3] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830001, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2018, Vol. 8, Issue 03
Funding
National Natural Science Foundation of China
Keywords
polyphonic piano transcription; note-based music language model; recurrent neural network; restricted Boltzmann machine;
DOI
10.3390/app8030470
Chinese Library Classification number
O6 [Chemistry]
Subject classification code
0703
Abstract
This paper proposes a note-based music language model (MLM) for improving note-level polyphonic piano transcription. The MLM is based on a recurrent structure, which can model the temporal correlations between notes in music sequences. To combine the outputs of the note-based MLM and the acoustic model directly, an integrated architecture is adopted. We also propose an inference algorithm in which the note-based MLM is used to predict notes at the blank onsets in the thresholding transcription results. The experimental results show that the proposed inference algorithm improves the performance of note-level transcription. We also observe that the combination of the restricted Boltzmann machine (RBM) and a recurrent structure outperforms a single recurrent neural network (RNN) or long short-term memory network (LSTM) in modeling high-dimensional note sequences. Among all the MLMs, LSTM-RBM helps the system yield the best results on all evaluation metrics, regardless of the performance of the acoustic models.
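The inference idea in the abstract (thresholding the acoustic output, then letting the MLM predict notes at onsets left blank) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_mlm` is a hypothetical stand-in for a trained note-based MLM, and the product rule for combining scores, the single-note fallback, and the 0.5 threshold are all assumptions made for illustration.

```python
def threshold_transcribe(acoustic_probs, threshold=0.5):
    """Keep every pitch whose acoustic probability reaches the threshold."""
    return [
        {p for p, prob in enumerate(onset) if prob >= threshold}
        for onset in acoustic_probs
    ]


def toy_mlm(previous_notes, n_pitches):
    """Hypothetical stand-in for a trained MLM (assumption, not the paper's model).

    Assigns higher probability to pitches close to the previous onset's notes.
    """
    if not previous_notes:
        return [1.0 / n_pitches] * n_pitches
    scores = [
        1.0 / (1 + min(abs(p - q) for q in previous_notes))
        for p in range(n_pitches)
    ]
    total = sum(scores)
    return [s / total for s in scores]


def fill_blank_onsets(acoustic_probs, threshold=0.5, mlm=toy_mlm):
    """Threshold first; at blank onsets, pick the pitch with the best combined score."""
    notes = threshold_transcribe(acoustic_probs, threshold)
    previous = set()
    for t, onset in enumerate(acoustic_probs):
        if not notes[t]:  # blank onset: no pitch passed the threshold
            prior = mlm(previous, len(onset))
            # Assumed combination rule: product of acoustic and MLM probabilities.
            combined = [a * m for a, m in zip(onset, prior)]
            best = max(range(len(onset)), key=combined.__getitem__)
            notes[t] = {best}
        previous = notes[t]
    return notes


# Three onsets over four pitches; the second onset is blank after thresholding.
acoustic = [
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.4, 0.0, 0.4],
    [0.0, 0.0, 0.8, 0.7],
]
result = fill_blank_onsets(acoustic)
```

In this toy input, pitches 1 and 3 tie acoustically at the blank onset, and the stand-in language model breaks the tie toward the pitch nearer the preceding note.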
Pages: 15
Related papers
50 entries in total
  • [21] SVM Based Transcription System with Short-Term Memory Oriented to Polyphonic Piano Music
    Costantini, Giovanni
    Todisco, Massimiliano
    Perfetti, Renzo
    Basili, Roberto
    Casali, Daniele
    MELECON 2010: THE 15TH IEEE MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, 2010, : 196 - 201
  • [22] Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices
    Nakamura, Eita
    Yoshii, Kazuyoshi
    Sagayama, Shigeki
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (04) : 794 - 806
  • [23] Summary of Dissertation: Transcription of polyphonic music for piano based in resolution of groups of notes and finite states
    Gómez-Meire S.
    Inteligencia Artificial, 2010, 14 (45) : 44 - 47
  • [24] Polyphonic piano transcription based on graph convolutional network
    Xiao, Zhe
    Chen, Xin
    Zhou, Li
    SIGNAL PROCESSING, 2023, 212
  • [25] POLYPHONIC MUSIC TRANSCRIPTION USING NOTE ONSET AND OFFSET DETECTION
    Benetos, Emmanouil
    Dixon, Simon
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 37 - 40
  • [26] Methodologies for evaluation of note-based music-retrieval systems
    Uitdenbogerd, Alexandra L.
    Chattaraj, Abhijit
    Zobel, Justin
    INFORMS JOURNAL ON COMPUTING, 2006, 18 (03) : 339 - 347
  • [27] Towards automatic music transcription: Extraction of midi-data out of polyphonic piano music
    Wellhausen, J
    Krause, I
    8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL VI, PROCEEDINGS: IMAGE, ACOUSTIC, SIGNAL PROCESSING AND OPTICAL SYSTEMS, TECHNOLOGIES AND APPLICATIONS, 2004, : 114 - 118
  • [28] JOINT MULTI-PITCH DETECTION AND SCORE TRANSCRIPTION FOR POLYPHONIC PIANO MUSIC
    Liu, Lele
    Morfi, Veronica
    Benetos, Emmanouil
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 281 - 285
  • [29] Polyphonic Piano Music Transcription using Long Short-Term Memory
    Sadekar, Aakash
    Mahajan, Shrinivas P.
    2019 10TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2019,
  • [30] Study of Automatic Piano Transcription Algorithms based on the Polyphonic Properties of Piano Audio
    Liang Y.
    Pan F.
    IEIE Transactions on Smart Processing and Computing, 2023, 12 (05) : 412 - 418