Melody transcription from music audio: Approaches and evaluation

Cited by: 86
Authors
Poliner, Graham E. [1]
Ellis, Daniel P. W.
Ehmann, Andreas F.
Gomez, Emilia
Streich, Sebastian
Ong, Beesuan
Affiliations
[1] Columbia Univ, Dept Elect Engn, LabROSA, New York, NY 10027 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
[3] Univ Pompeu Fabra, Mus Technol Grp, Barcelona 08002, Spain
Funding
U.S. National Science Foundation; Andrew W. Mellon Foundation
Keywords
audio; evaluation; melody transcription; music
DOI
10.1109/TASL.2006.889797
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Although analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that can be tractably extracted and still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence and absence of the melody. Melodies transcribed at this level are readily recognizable and show promise for practical applications.
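To make the frame-level figure concrete, the following is a minimal sketch of how such a score could be computed. It is an illustration under stated assumptions, not the scoring code used in the 2004 and 2005 evaluations: the function name `frame_accuracy`, the convention that 0 Hz marks a frame with no melody, and the 50-cent tolerance are all assumptions introduced here.

```python
import math


def frame_accuracy(ref_hz, est_hz, tol_cents=50.0):
    """Fraction of frames where the estimate agrees with the reference.

    Assumptions (not taken from the record): a value of 0 marks a frame
    with no melody, and a pitched frame counts as correct when it lies
    within tol_cents of the reference pitch.
    """
    if len(ref_hz) != len(est_hz):
        raise ValueError("sequences must have the same number of frames")
    if not ref_hz:
        raise ValueError("need at least one frame")
    correct = 0
    for ref, est in zip(ref_hz, est_hz):
        if ref <= 0 or est <= 0:
            # Correct only if both agree that no melody is present.
            correct += (ref <= 0 and est <= 0)
        else:
            # Pitch distance in cents (100 cents = 1 semitone).
            cents = abs(1200.0 * math.log2(est / ref))
            correct += (cents <= tol_cents)
    return correct / len(ref_hz)


# Hypothetical five-frame example: frames 3 and 5 are wrong.
ref = [0.0, 220.0, 220.0, 440.0, 0.0]
est = [0.0, 221.0, 330.0, 440.0, 262.0]
print(frame_accuracy(ref, est))  # 0.6
```

Scoring pitch and melody presence in a single pass mirrors the abstract's wording, where the reported accuracy includes distinguishing between the presence and absence of the melody.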
Pages: 1247-1256
Page count: 10