Polyphonic Piano Transcription with a Note-Based Music Language Model

Cited by: 6
Authors
Wang, Qi [1, 2]
Zhou, Ruohua [1, 2]
Yan, Yonghong [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
[3] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830001, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2018 / Vol. 8 / Issue 03
Funding
National Natural Science Foundation of China
Keywords
polyphonic piano transcription; note-based music language model; recurrent neural network; restricted Boltzmann machine;
DOI
10.3390/app8030470
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
This paper proposes a note-based music language model (MLM) for improving note-level polyphonic piano transcription. The MLM is built on a recurrent structure, which can model the temporal correlations between notes in music sequences. To combine the outputs of the note-based MLM and the acoustic model directly, an integrated architecture is adopted. We also propose an inference algorithm in which the note-based MLM is used to predict notes at the blank onsets in the thresholding transcription results. The experimental results show that the proposed inference algorithm improves the performance of note-level transcription. We also observe that combining the restricted Boltzmann machine (RBM) with a recurrent structure outperforms a single recurrent neural network (RNN) or long short-term memory network (LSTM) in modeling high-dimensional note sequences. Among all the MLMs, the LSTM-RBM helps the system yield the best results on all evaluation metrics, regardless of the performance of the acoustic models.
Pages: 15
Related papers
50 in total
  • [1] NOTE ONSET DETECTION FOR THE TRANSCRIPTION OF POLYPHONIC PIANO MUSIC
    Boogaart, C. G. V. D.
    Lienhart, R.
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 446 - 449
  • [2] Note-Based Sound Source Separation in Monoaural Polyphonic Music
    Aczel, Kristof
    Vajk, Istvan
    ACTA ACUSTICA UNITED WITH ACUSTICA, 2010, 96 (05) : 947 - 958
  • [3] Event based transcription system for polyphonic piano music
    Costantini, Giovanni
    Perfetti, Renzo
    Todisco, Massimiliano
    SIGNAL PROCESSING, 2009, 89 (09) : 1798 - 1811
  • [4] Automatic transcription of piano polyphonic music
    Kobzantsev, A
    Chazan, D
    Zeevi, Y
    ISPA 2005: Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005, : 414 - 418
  • [5] Transcription of polyphonic piano music with neural networks
    Marolt, M
    MELECON 2000: INFORMATION TECHNOLOGY AND ELECTROTECHNOLOGY FOR THE MEDITERRANEAN COUNTRIES, VOLS 1-3, PROCEEDINGS, 2000, : 512 - 515
  • [6] Note-based sound source separation of polyphonic recordings
    Aczel, Kristof
    Vajk, Istvan
    INFOCOMMUNICATIONS JOURNAL, 2009, 1 (01) : 36 - 40
  • [7] DEEP POLYPHONIC ADSR PIANO NOTE TRANSCRIPTION
    Kelz, Rainer
    Boeck, Sebastian
    Widmer, Gerhard
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 246 - 250
  • [8] Generative model based polyphonic music transcription
    Cemgil, AT
    Kappen, B
    Barber, D
    2003 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS PROCEEDINGS, 2003, : 181 - 184
  • [9] A Discriminative Model for Polyphonic Piano Transcription
    Graham E. Poliner
    Daniel P. W. Ellis
    EURASIP Journal on Advances in Signal Processing, 2007
  • [10] A discriminative model for polyphonic piano transcription
    Poliner, Graham E.
    Ellis, Daniel P. W.
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2007, 2007 (1)