Sound Source Separation for Robot Audition using Deep Learning

Cited by: 0
Authors
Noda, Kuniaki [1]
Hashimoto, Naoya [1]
Nakadai, Kazuhiro [2]
Ogata, Tetsuya [1]
Affiliations
[1] Waseda Univ, Grad Sch Fundamental Sci & Engn, Tokyo 1698555, Japan
[2] Honda Res Inst Japan Co Ltd, Saitama 3510114, Japan
DOI
Not available
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Noise-robust speech recognition is crucial for effective human-machine interaction in real-world environments. Sound source separation (SSS) is one of the most widely used approaches to noise-robust speech recognition: it extracts a target speaker's speech signal while suppressing simultaneous unintended signals. However, conventional SSS algorithms, such as independent component analysis or nonlinear principal component analysis, are limited in their ability to model complex projections scalably. Moreover, conventional systems require designing an independent subsystem for noise reduction (NR) in addition to the SSS. To overcome these issues, we propose a deep neural network (DNN) framework for modeling the separation function (SF) of an SSS system. By training a DNN to predict clean sound features of a target sound from the corresponding multichannel deteriorated sound feature inputs, we enable the DNN to model the SF for extracting the target sound without prior knowledge of the acoustic properties of the surrounding environment. Moreover, the same DNN is trained to function simultaneously as an NR filter. Our proposed SSS system is evaluated on an isolated word recognition task and a large vocabulary continuous speech recognition task in which either nondirectional or directional noise is added to the target speech. Our evaluation results demonstrate that the DNN performs noticeably better than the baseline approach, especially when directional noise is added at a low signal-to-noise ratio.
Pages: 389-394
Number of pages: 6
Related papers
50 in total
  • [21] Three ring microphone array for 3D sound localization and separation for mobile robot audition
    Tamai, Y
    Sasaki, Y
    Kagami, S
    Mizoguchi, H
    2005 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4, 2005, : 903 - 908
  • [22] Sound source localization for auditory perception of a humanoid robot using deep neural networks
    Boztas, G.
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (09): : 6801 - 6811
  • [24] Recursive Sound Source Separation with Deep Learning-based Beamforming for Unknown Number of Sources
    Munakata, Hokuto
    Takeda, Ryu
    Komatani, Kazunori
    INTERSPEECH 2023, 2023, : 1688 - 1692
  • [25] Deep-learning-based Single-channel Sound Source Separation in Noisy Environments
    Furuya, Ken'ichi
    Miura, Iori
    2024 11TH INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-TAIWAN, ICCE-TAIWAN 2024, 2024, : 71 - 72
  • [26] Daily Sound Recognition Using Pitch-Cluster-Maps for Mobile Robot Audition
    Sasaki, Yoko
    Kaneyoshi, Masahito
    Kagami, Satoshi
    Mizoguchi, Hiroshi
    Enomoto, Tadashi
    2009 IEEE-RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, 2009, : 2724 - 2729
  • [27] Sound Source Separation Using Spatio-temporal Sound Pressure Distribution Images and Machine Learning
    Ozawa, Kenji
    Shiozawa, Koichiro
    Ise, Tomohiko
    PROCEEDINGS 2019 AMITY INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AICAI), 2019, : 54 - 60
  • [28] VISUAL SOUND SOURCE SEPARATION WITH PARTIAL SUPERVISION LEARNING
    Wang, Huasen
    Gao, Lingling
    Tan, Qianchao
    Ji, Luping
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2127 - 2131
  • [29] Application of deep learning for accurate source localization using sound intensity vector
    Jeong, Iljoo
    Jung, In-Jee
    Lee, Seungchul
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2024, 43 (01): : 72 - 77
  • [30] Sound Source Distance Estimation Using Deep Learning: An Image Classification Approach
    Yiwere, Mariam
    Rhee, Eun Joo
    SENSORS, 2020, 20 (01)