Melody transcription from music audio: Approaches and evaluation

Cited by: 86

Authors:

Poliner, Graham E. [1]

Ellis, Daniel P. W.

Ehmann, Andreas F.

Gomez, Emilia

Streich, Sebastian

Ong, Beesuan

Affiliations:

[1] Columbia Univ, Dept Elect Engn, LabROSA, New York, NY 10027 USA

[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA

[3] Univ Pompeu Fabra, Mus Technol Grp, Barcelona 08002, Spain

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 4

Funding:

U.S. National Science Foundation; Andrew W. Mellon Foundation (USA);

Keywords:

audio; evaluation; melody transcription; music;

DOI:

10.1109/TASL.2006.889797

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it, and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.

Pages: 1247-1256

Page count: 10
