Graph modeling for vocal melody extraction from polyphonic music

Cited by: 1

Authors:

Zhang, Weiwei [1]

Yan, Lingyu [1]

Zhang, Qiaoling [2]

Gao, Jinyi [1]

Affiliations:

[1] Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China

[2] Zhejiang Sci Tech Univ, Sch Informat & Elect, Hangzhou 310018, Peoples R China

Source:

APPLIED ACOUSTICS | 2023 / Vol. 211

Keywords:

Vocal melody extraction; Graph modeling; Graph convolutional network; Shift-invariant graph structure; AUDIO;

DOI:

10.1016/j.apacoust.2023.109491

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper proposes a vocal melody extraction method based on graph modeling. First, the constant-Q transform of the mixed audio signal is computed. Then, the amplitude spectra of several adjacent frames are concatenated to construct the input feature. Next, an undirected graph is constructed to model the melody extraction problem: the frequency bins are treated as nodes, and the underlying connection relationships between frequency bins are defined as edges. The frame-wise melodic pitches are estimated by a graph convolutional network (GCN), with pitch estimation cast as a multi-class classification problem. Finally, the quantized frame-wise pitches are fine-tuned according to a salience function defined over a certain range around the smoothed melody trajectory obtained from the GCN-estimated pitches. Because the edges are defined according to the underlying connection relationships of different frequency bins, the proposed method addresses vocal melody extraction in an explainable way. Experimental results demonstrate that the proposed method achieves good performance with a lightweight set of parameters. © 2023 Elsevier Ltd. All rights reserved.
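
The pipeline in the abstract (context-frame concatenation, frequency bins as graph nodes, graph convolution, per-frame classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state the edge rule, the network depth, or the class layout, so the following are assumptions made only for illustration: `harmonic_adjacency` links each bin to its next neighbour and to bins at harmonic offsets on a log-frequency axis, the network has two graph-convolution layers with random weights, and each frequency bin doubles as one pitch class. A random array stands in for the constant-Q magnitude spectrogram, and all function names are hypothetical.

```python
import numpy as np

def stack_context(spec, k):
    """Concatenate each frame with its k neighbours on either side.
    spec: (n_bins, n_frames) magnitudes -> (n_frames, n_bins, 2k+1)."""
    n_frames = spec.shape[1]
    padded = np.pad(spec, ((0, 0), (k, k)), mode="edge")
    return np.stack([padded[:, t:t + 2 * k + 1] for t in range(n_frames)])

def harmonic_adjacency(n_bins, bins_per_octave, n_harmonics):
    """ASSUMED edge rule (not taken from the paper): connect each bin to the
    next bin and to bins at harmonic intervals. On a log-frequency axis these
    offsets are the same for every bin, so the structure is shift-invariant."""
    offsets = {1} | {int(round(bins_per_octave * np.log2(h)))
                     for h in range(2, n_harmonics + 1)}
    adj = np.eye(n_bins)  # self-loops
    for i in range(n_bins):
        for d in offsets:
            if i + d < n_bins:
                adj[i, i + d] = adj[i + d, i] = 1.0  # undirected edge
    return adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, as commonly used in GCNs."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def frame_logits(a_hat, x, w1, w2):
    """Two graph-convolution layers; returns one score per node (bin)."""
    h = np.maximum(a_hat @ x @ w1, 0.0)  # ReLU(A_hat X W1)
    return (a_hat @ h @ w2)[:, 0]        # A_hat H W2, squeezed to (n_bins,)

# Toy run with random data standing in for a CQT magnitude spectrogram.
rng = np.random.default_rng(0)
spec = rng.random((60, 8))                 # 60 bins, 8 frames
feats = stack_context(spec, k=2)           # 5 context frames per node
adj = harmonic_adjacency(60, bins_per_octave=12, n_harmonics=4)
a_hat = normalize(adj)
w1 = rng.standard_normal((5, 16)) * 0.1    # untrained, illustrative weights
w2 = rng.standard_normal((16, 1)) * 0.1
logits = frame_logits(a_hat, feats[0], w1, w2)
pitch_bin = int(np.argmax(logits))         # multi-class decision for frame 0
```

In a real system the weights would be trained with a classification loss, and the quantized output would then be refined by the salience-based fine-tuning step described in the abstract, which this sketch omits.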


Pages: 13

50 items in total

[21] Vocal Pitch Extraction in Polyphonic Music using Convolutional Residual Network
Dong, Mingye
Wu, Jie
Luan, Jian
INTERSPEECH 2019, 2019 : 2010 - 2014
[22] FUSING TRANSCRIPTION RESULTS FROM POLYPHONIC AND MONOPHONIC AUDIO FOR SINGING MELODY TRANSCRIPTION IN POLYPHONIC MUSIC
Zhu, Bilei
Wu, Fuzhang
Li, Ke
Wu, Yongjian
Huang, Feiyue
Wu, Yunsheng
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017 : 296 - 300
[23] Singing Transcription from Polyphonic Music Using Melody Contour Filtering
He, Zhuang
Feng, Yin
APPLIED SCIENCES-BASEL, 2021, 11 (13)
[24] Extracting vocal melody from Karaoke music audio
Zhu, YW
Gao, S
2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005 : 1111 - 1114
[25] Automatic Transcription of Polyphonic Vocal Music
McLeod, Andrew
Schramm, Rodrigo
Steedman, Mark
Benetos, Emmanouil
APPLIED SCIENCES-BASEL, 2017, 7 (12)
[26] Towards a Computational Model of Melody Identification in Polyphonic Music
Madsen, Soren Tjagvad
Widmer, Gerhard
20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007 : 459 - 464
[27] Gesture and melody in Indian vocal music
Rahaim, Matt
Gesture, 2008, 8 (03) : 325 - 347
[28] MCSSME: Multi-Task Contrastive Learning for Semi-supervised Singing Melody Extraction from Polyphonic Music
Yu, Shuai
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024 : 365 - 373
[29] MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music
Gao, Yuan
Hu, Ying
Wang, Liusong
Huang, Hao
He, Liang
INTERSPEECH 2023, 2023 : 5396 - 5400
[30] Automatic transcription of melody, bass line, and chords in polyphonic music
Ryynanen, Matti P.
Klapuri, Anssi P.
COMPUTER MUSIC JOURNAL, 2008, 32 (03) : 72 - 86
