Study of Automatic Piano Transcription Algorithms based on the Polyphonic Properties of Piano Audio

Authors
Liang Y. [1]
Pan F. [2]
Affiliations
[1] Department of Educational Sciences and Music, Luoyang Institute of Science and Technology, Henan, Luoyang
[2] Department of Sports Training, Guangzhou Sport University, Guangdong, Guangzhou
Keywords
Automatic transcription; Convolutional neural network; Piano audio; Polyphonic characteristics
DOI
10.5573/IEIESPC.2023.12.5.412
Abstract
The polyphonic characteristics of piano audio make automatic transcription particularly challenging. This study briefly analyzes those characteristics and introduces three time-frequency representations used as input features for piano audio: the short-time Fourier transform (STFT), the constant-Q transform (CQT), and the variable-Q transform (VQT). An algorithm combining a convolutional neural network (CNN) with a bidirectional gated recurrent unit (BiGRU) was developed and tested on the MAPS dataset to detect note start and end points and the fundamental tones of polyphonic notes. The combined algorithm performed better with VQT as input than with STFT or CQT. In fundamental tone detection, CNN-BiGRU outperformed both CNN and CNN-GRU, reaching a precision (P), recall (R), and F1-measure of 97.16%, 97.34%, and 97.25%, respectively. The authors conclude that the designed automatic piano transcription algorithm is reliable and can be applied in practical music settings. Copyrights © 2023 The Institute of Electronics and Information Engineers.
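As a quick consistency check on the reported figures, the sketch below recomputes the F1-measure from the reported precision and recall, assuming the standard definition of F1 as the harmonic mean of the two. The function name `f1_measure` and the variable names are illustrative and do not come from the paper.

```python
def f1_measure(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for CNN-BiGRU fundamental tone detection.
reported_p = 0.9716
reported_r = 0.9734
reported_f1 = 0.9725

computed_f1 = f1_measure(reported_p, reported_r)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1:.4f}")
# prints: computed F1 = 0.9725, reported F1 = 0.9725
```

Under that assumption, the three reported values are mutually consistent to the precision given.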
Pages: 412-418
Number of pages: 6