MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer With One Transformer VAE

Cited by: 9
Authors
Wu, Shih-Lun [1, 2]
Yang, Yi-Hsuan [1, 3]
Affiliations
[1] Taiwan AI Labs, Taipei 10355, Taiwan
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Natl Taiwan Univ, Taipei 10617, Taiwan
Keywords
Measurement; Recurrent neural networks; Music; Transformers; Harmonic analysis; Decoding; Task analysis; Transformer; variational autoencoder (VAE); deep learning; controllable music generation; music style transfer;
DOI
10.1109/TASLP.2023.3270726
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Transformers and variational autoencoders (VAE) have been extensively employed for symbolic (e.g., MIDI) domain music generation. While the former boast an impressive capability in modeling long sequences, the latter allow users to willingly exert control over different parts (e.g., bars) of the music to be generated. In this paper, we are interested in bringing the two together to construct a single model that exhibits both strengths. The task is split into two steps. First, we equip Transformer decoders with the ability to accept segment-level, time-varying conditions during sequence generation. Subsequently, we combine the developed and tested in-attention decoder with a Transformer encoder, and train the resulting MuseMorphose model with the VAE objective to achieve style transfer of long pop piano pieces, in which users can specify musical attributes including rhythmic intensity and polyphony (i.e., harmonic fullness) they desire, down to the bar level. Experiments show that MuseMorphose outperforms recurrent neural network (RNN) based baselines on numerous widely-used metrics for style transfer tasks.
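One way to picture the "segment-level, time-varying conditions" mentioned in the abstract is that every token inherits the attribute classes chosen for its bar. The sketch below illustrates only that broadcasting step and is not the authors' implementation. The function name, the input layout, and the integer class values are all hypothetical.

```python
from typing import List, Tuple


def expand_bar_conditions(
    token_bar_ids: List[int],
    rhythm_per_bar: List[int],
    polyphony_per_bar: List[int],
) -> List[Tuple[int, int]]:
    """Broadcast bar-level attribute classes to every token in that bar.

    All names and the data layout are illustrative assumptions.
    """
    if len(rhythm_per_bar) != len(polyphony_per_bar):
        raise ValueError("attribute lists must cover the same number of bars")
    # Each token looks up the (rhythm, polyphony) pair of the bar it belongs to.
    return [(rhythm_per_bar[b], polyphony_per_bar[b]) for b in token_bar_ids]


# Hypothetical toy input: 6 tokens spread over 3 bars.
token_bar_ids = [0, 0, 1, 1, 1, 2]
rhythm_per_bar = [2, 5, 0]     # assumed rhythmic-intensity class per bar
polyphony_per_bar = [1, 1, 7]  # assumed polyphony class per bar

conditions = expand_bar_conditions(token_bar_ids, rhythm_per_bar, polyphony_per_bar)
```

In a full model, such per-token conditions would typically be embedded and fed to the decoder alongside the token sequence. How MuseMorphose does this is specified in the paper, not in this record.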
Pages: 1953-1967
Page count: 15
Related papers
14 in total
  • [1] CPS: FULL-SONG AND STYLE-CONDITIONED MUSIC GENERATION WITH LINEAR TRANSFORMER
    Wang, Weipeng
    Li, Xiaobing
    Jin, Cong
    Lu, Di
    Zhou, Qingwen
    Tie, Yun
    2022 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (IEEE ICMEW 2022), 2022
  • [2] Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
    Hsiao, Wen-Yi
    Liu, Jen-Yu
    Yeh, Yin-Cheng
    Yang, Yi-Hsuan
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 178 - 186
  • [3] Fine-Grained Image Style Transfer with Visual Transformers
    Wang, Jianbo
    Yang, Huan
    Fu, Jianlong
    Yamasaki, Toshihiko
    Guo, Baining
    COMPUTER VISION - ACCV 2022, PT III, 2023, 13843 : 427 - 443
  • [4] STYLEPTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer
    Lyu, Yiwei
    Liang, Paul Pu
    Pham, Hai
    Hovy, Eduard
    Poczos, Barnabas
    Salakhutdinov, Ruslan
    Morency, Louis-Philippe
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 2116 - 2138
  • [5] FINE-GRAINED STYLE CONTROL IN TRANSFORMER-BASED TEXT-TO-SPEECH SYNTHESIS
    Chen, Li-Wei
    Rudnicky, Alexander
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7907 - 7911
  • [6] SELF-SUPERVISED VQ-VAE FOR ONE-SHOT MUSIC STYLE TRANSFER
    Cifka, Ondrej
    Ozerov, Alexey
    Simsekli, Umut
    Richard, Gael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 96 - 100
  • [7] Fine-grained weed recognition using Swin Transformer and two-stage transfer learning
    Wang, Yecheng
    Zhang, Shuangqing
    Dai, Baisheng
    Yang, Sensen
    Song, Haochen
    FRONTIERS IN PLANT SCIENCE, 2023, 14
  • [8] Curriculum-Style Fine-Grained Adaption for Unsupervised Cross-Lingual Dependency Transfer
    Guo, Peiming
    Huang, Shen
    Jiang, Peijie
    Sun, Yueheng
    Zhang, Meishan
    Zhang, Min
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 322 - 332
  • [9] Fine-Grained Position Helps Memorizing More, a Novel Music Compound Transformer Model with Feature Interaction Fusion
    Li, Zuchao
    Gong, Ruhan
    Chen, Yineng
    Su, Kehua
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 4, 2023, : 5203 - 5212
  • [10] FineStyle: Semantic-Aware Fine-Grained Motion Style Transfer with Dual Interactive-Flow Fusion
    Song, Wenfeng
    Jin, Xingliang
    Li, Shuai
    Chen, Chenglizhao
    Hao, Aimin
    Hou, Xia
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (11) : 4361 - 4371