Groove2Groove: One-Shot Music Style Transfer With Supervision From Synthetic Data

Cited by: 22
Authors
Cifka, Ondrej [1]
Simsekli, Umut [1]
Richard, Gael [1]
Affiliations
[1] Institut Polytechnique de Paris, Télécom Paris, Information Processing and Communication Laboratory (LTCI), F-91120 Palaiseau, France
Funding
European Union Horizon 2020
Keywords
Music; Task analysis; Speech processing; Training; Neural networks; Supervised learning; Instruments; Style transfer; symbolic music; synthetic data; deep learning; recurrent neural networks
DOI
10.1109/TASLP.2020.3019642
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Style transfer is the process of changing the style of an image, video, audio clip or musical piece so as to match the style of a given example. Even though the task has interesting practical applications within the music industry, it has so far received little attention from the audio and music processing community. In this article, we present Groove2Groove, a one-shot style transfer method for symbolic music, focusing on the case of accompaniment styles in popular music and jazz. We propose an encoder-decoder neural network for the task, along with a synthetic data generation scheme to supply it with parallel training examples. This synthetic parallel data allows us to tackle the style transfer problem using end-to-end supervised learning, employing powerful techniques used in natural language processing. We experimentally demonstrate the performance of the model on style transfer using existing and newly proposed metrics, and also explore the possibility of style interpolation.
Pages: 2638-2650
Number of pages: 13