Background-Sound Controllable Voice Source Separation

Cited by: 0
Authors
Eom, Deokjun [1]
Nam, Woo Hyun [1]
Kim, Kyung-Rae [1]
Affiliations
[1] Samsung Elect, Samsung Res, Suwon, South Korea
Source
Keywords
background-sound controllable; voice source separation; speech separation; deep learning;
DOI
10.21437/Interspeech.2023-185
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Various approaches have been proposed for separating mixed voices. In the real world, input voices contain many different kinds of background sounds, but existing methods do not account for background sounds in their model architectures. With these approaches it is difficult to control the background sounds directly, and the separated voices retain background sounds unpredictably. In this paper, we propose an extended voice separation framework, background-sound controllable voice source separation, which controls the degree of background sound in the separated outputs through a control parameter ranging from 0 to 1, without additional mixing procedures. Several experiments on various real-world datasets demonstrate the controllability of background sounds while preserving voice separation performance.
Pages: 1698-1702
Page count: 5
Related papers
50 items in total
  • [1] Effective Sound Source Separation Using Single Voice Activity Segments for Binaural Sound
    Noguchi, Wataru
    Kawamura, Arata
    Iiguni, Youji
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018: 279-284
  • [2] Voice extraction in video shooting area by using sound source separation technique
    Togami, Masahito
    Kawaguchi, Yohei
    Yamamoto, Yuji
    Takada, Shintaro
    Sato, Erina
    Obuchi, Yasunari
    Nukaga, Nobuo
    Journal of the Institute of Electronics, Information and Communication Engineers, 2013, 96 (11): 848-855
  • [3] NOVEL SOUND MIXING METHOD FOR VOICE AND BACKGROUND MUSIC
    Owaki, Wataru
    Takahashi, Kota
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 290-294
  • [4] THE SOUND OF MY VOICE: SPEAKER REPRESENTATION LOSS FOR TARGET VOICE SEPARATION
    Mun, Seongkyu
    Choe, Soyeon
    Huh, Jaesung
    Chung, Joon Son
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 7289-7293
  • [5] OPERATING MECHANISM OF THE VOICE SOUND SOURCE (REVIEW)
    GALUNOV, VI
    TAMPEL, IB
    SOVIET PHYSICS ACOUSTICS-USSR, 1981, 27 (03): 177-184
  • [6] A noise-robust voice conversion method with controllable background sounds
    Chen, Lele
    Zhang, Xiongwei
    Li, Yihao
    Sun, Meng
    Chen, Weiwei
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03): 3981-3994
  • [7] Foreground-Background Ambient Sound Scene Separation
    Olvera, Michel
    Vincent, Emmanuel
    Serizel, Romain
    Gasso, Gilles
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021: 281-285
  • [8] ON THE PERCEPTUAL RELEVANCE OF OBJECTIVE SOURCE SEPARATION MEASURES FOR SINGING VOICE SEPARATION
    Gupta, Udit
    Moore, Elliot, II
    Lerch, Alexander
    2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2015
  • [9] Video-Guided Sound Source Separation
    Zhou, Junfeng
    Wang, Feng
    Guo, Di
    Liu, Huaping
    Sun, Fuchun
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT I, 2019, 11740: 415-426
  • [10] Joint Sound Source Separation and Speaker Recognition
    Zegers, Jeroen
    Van Hamme, Hugo
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016: 2228-2232