Melody transcription from music audio: Approaches and evaluation

Cited by: 86
Authors
Poliner, Graham E. [1]
Ellis, Daniel P. W.
Ehmann, Andreas F.
Gomez, Emilia
Streich, Sebastian
Ong, Beesuan
Affiliations
[1] Columbia Univ, Dept Elect Engn, LabROSA, New York, NY 10027 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
[3] Univ Pompeu Fabra, Mus Technol Grp, Barcelona 08002, Spain
Funding
U.S. National Science Foundation; Andrew W. Mellon Foundation;
Keywords
audio; evaluation; melody transcription; music;
DOI
10.1109/TASL.2006.889797
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it, and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.
Pages: 1247 - 1256
Page count: 10
Related papers
50 in total
  • [31] Harmonic Adaptive Latent Component Analysis of Audio and Application to Music Transcription
    Fuentes, Benoit
    Badeau, Roland
    Richard, Gael
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (09) : 1854 - 1866
  • [32] Automatic transcription of piano music using audio-vision fusion
    Wan, Yulong
    Wu, Zhigang
    Zhou, Ruohua
    Yan, Yonghong
    MEASUREMENT TECHNOLOGY AND ENGINEERING RESEARCHES IN INDUSTRY, PTS 1-3, 2013, 333-335 : 742 - +
  • [33] Extracting information from music audio
    Ellis, Daniel P. W.
    COMMUNICATIONS OF THE ACM, 2006, 49 (08) : 32 - 37
  • [34] Extracting information from music audio
    Department of Electrical Engineering, Columbia University, NY
    Commun ACM, 2006, 8 (32-37):
  • [35] The Impact of Audio Input Representations on Neural Network based Music Transcription
    Cheuk, Kin Wai
    Agres, Kat
    Herremans, Dorien
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [36] Efficient Vocal Melody Extraction from Polyphonic Music Signals
    Yao, G.
    Zheng, Y.
    Xiao, L.
    Ruan, L.
    Li, Y.
    ELEKTRONIKA IR ELEKTROTECHNIKA, 2013, 19 (06) : 103 - 108
  • [37] Graph modeling for vocal melody extraction from polyphonic music
    Zhang, Weiwei
    Yan, Lingyu
    Zhang, Qiaoling
    Gao, Jinyi
    APPLIED ACOUSTICS, 2023, 211
  • [38] GROUP DELAY BASED MELODY MONOPITCH EXTRACTION FROM MUSIC
    Rajan, Rajeev
    Murthy, Hema A.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013 : 186 - 190
  • [39] Songs From This Season TRUE MELODY MUSIC (T Green)
    Murph, John
    DOWN BEAT, 2013, 80 (03) : 66 - 66
  • [40] Ternary Code of Melody and Reliable Audio Watermarking
    Absalyamova, Karina S.
    Latypov, Rustam Kh
    Stolov, Evgeni L.
    2019 27TH TELECOMMUNICATIONS FORUM (TELFOR 2019), 2019 : 524 - 527