Melody transcription from music audio: Approaches and evaluation

Cited by: 86

Authors:

Poliner, Graham E. [1]

Ellis, Daniel P. W.

Ehmann, Andreas F.

Gomez, Emilia

Streich, Sebastian

Ong, Beesuan

Affiliations:

[1] Columbia Univ, Dept Elect Engn, LabROSA, New York, NY 10027 USA

[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA

[3] Univ Pompeu Fabra, Mus Technol Grp, Barcelona 08002, Spain

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 04

Funding:

U.S. National Science Foundation; Andrew W. Mellon Foundation (U.S.);

Keywords:

audio; evaluation; melody transcription; music;

DOI:

10.1109/TASL.2006.889797

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it, and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.


Pages: 1247-1256

Number of pages: 10
