FUSING TRANSCRIPTION RESULTS FROM POLYPHONIC AND MONOPHONIC AUDIO FOR SINGING MELODY TRANSCRIPTION IN POLYPHONIC MUSIC

Cited by: 0
Authors
Zhu, Bilei [1]
Wu, Fuzhang [1]
Li, Ke [1]
Wu, Yongjian [1]
Huang, Feiyue [1]
Wu, Yunsheng [1]
Affiliations
[1] Tencent Youtu AI Lab, Shenzhen, People's Republic of China
Keywords
Singing melody transcription; polyphonic audio; monophonic singing recordings; deep neural network (DNN); pitch sequence selection; SPEECH;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper presents a new system for singing melody transcription from polyphonic songs. Instead of operating solely on the polyphonic audio of each song (as most existing systems do), our system additionally takes as input multiple monophonic recordings of people singing the song. To transcribe the singing melody of a song, the system first tracks the singing pitch from the polyphonic audio using a deep neural network (DNN)-based method, and then uses the estimated pitch series as a reference to select among the pitch sequences extracted from the monophonic singing recordings. The selected monophonic pitch sequences, as well as the DNN pitch series from the polyphonic audio, are then transcribed separately, and their transcription results are fused to form the final note sequence. Experimental results show that introducing monophonic singing recordings into transcription can significantly improve the performance of singing melody transcription from polyphonic songs.
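The select-then-fuse pipeline described in the abstract can be illustrated with a small sketch. The abstract does not state the selection criterion, the note transcription method, or the fusion rule, so everything below is an assumption made for illustration: pitch sequences are taken to be frame-aligned lists of fractional MIDI pitches (`None` for unvoiced frames), selection uses a mean absolute semitone distance with a hypothetical threshold `max_distance`, transcription is reduced to rounding each frame to the nearest semitone, and fusion is a frame-wise majority vote. All function names are hypothetical.

```python
from collections import Counter

UNVOICED = None  # marker for frames without a sung pitch


def pitch_distance(reference, candidate):
    """Mean absolute semitone difference over frames voiced in both sequences.

    Assumed selection criterion; the paper's actual measure may differ.
    """
    diffs = [abs(c - r) for r, c in zip(reference, candidate)
             if r is not UNVOICED and c is not UNVOICED]
    return sum(diffs) / len(diffs) if diffs else float("inf")


def select_sequences(reference, candidates, max_distance=1.0):
    """Keep monophonic sequences that stay close to the reference pitch series."""
    return [c for c in candidates
            if pitch_distance(reference, c) <= max_distance]


def transcribe(sequence):
    """Stand-in for note transcription: round voiced frames to semitones."""
    return [UNVOICED if p is UNVOICED else round(p) for p in sequence]


def fuse(transcriptions):
    """Frame-wise majority vote; ties go to the earliest-listed transcription."""
    fused = []
    for frame in zip(*transcriptions):
        counts = Counter(frame)
        best = max(counts.values())
        fused.append(next(v for v in frame if counts[v] == best))
    return fused


def to_notes(frames):
    """Merge runs of equal voiced frames into (midi_pitch, onset_frame, n_frames)."""
    notes = []
    start = 0
    for i in range(1, len(frames) + 1):
        if i == len(frames) or frames[i] != frames[start]:
            if frames[start] is not UNVOICED:
                notes.append((frames[start], start, i - start))
            start = i
    return notes


# Toy data: the reference stands in for the DNN pitch series from the
# polyphonic audio and contains an error in its last frame.
reference = [60.2, 60.1, None, 62.3, 62.0, 61.0]
recordings = [
    [60.0, 59.9, None, 62.1, 61.9, 62.0],  # close to the reference
    [59.8, 60.0, 60.0, 62.0, 62.2, 62.1],  # close to the reference
    [65.0, 65.0, 65.0, 67.0, 67.0, 67.0],  # off-key singer, rejected
]

selected = select_sequences(reference, recordings)
# The reference is listed first so that it wins ties in the vote.
fused = fuse([transcribe(reference)] + [transcribe(s) for s in selected])
notes = to_notes(fused)
print(notes)  # [(60, 0, 2), (62, 3, 3)]
```

In this toy case the two selected recordings outvote the erroneous last frame of the reference, which is the kind of correction that fusing several transcriptions is meant to allow. A real system would also need time alignment between recordings and the song, which this sketch assumes has already been done.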
Pages: 296-300
Page count: 5