Coarse-to-fine Optimization for Speech Enhancement

Cited by: 4
Authors
Yao, Jian [1]
Al-Dahle, Ahmad [1]
Affiliations
[1] Apple Inc, Cupertino, CA 95014 USA
Source
Keywords
speech enhancement; coarse-to-fine; deep learning; generative model; discriminative model; dynamic perceptual loss
DOI
10.21437/Interspeech.2019-2792
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
In this paper, we propose coarse-to-fine optimization for speech enhancement. Cosine similarity loss [1] has proven to be an effective metric for measuring the similarity of speech signals. However, in a high-dimensional space, enhanced speech signals with the same cosine similarity loss can still vary widely, so a deep neural network trained with this loss alone might not predict enhanced speech of good quality. Our coarse-to-fine strategy optimizes the cosine similarity loss at different granularities, so that more constraints are added to the prediction from high dimension to relatively low dimension. In this way, the enhanced speech better resembles the clean speech. Experimental results show the effectiveness of the proposed coarse-to-fine optimization in both discriminative and generative models. Moreover, we apply the coarse-to-fine strategy to the adversarial loss in a generative adversarial network (GAN) and propose a dynamic perceptual loss, which dynamically computes the adversarial loss from coarse resolution to fine resolution. The dynamic perceptual loss further improves accuracy and achieves state-of-the-art results compared with other generative models.
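The idea of optimizing a cosine similarity loss at several granularities could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not specify how granularities are defined or weighted, so the binary splitting into `2**level` segments, the equal weighting of levels, and the names `cosine_similarity`, `coarse_to_fine_loss`, and `num_levels` are all assumptions made here.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def coarse_to_fine_loss(enhanced, clean, num_levels=3):
    """Mean negative cosine similarity over progressively finer segments.

    Assumed scheme (not taken from the paper): level k splits both signals
    into 2**k contiguous segments, and all levels are weighted equally.
    """
    if len(enhanced) != len(clean):
        raise ValueError("signals must have the same length")
    total = 0.0
    for level in range(num_levels):
        num_segments = 2 ** level
        seg_len = len(clean) // num_segments
        if seg_len == 0:
            raise ValueError("signal too short for the requested levels")
        level_loss = 0.0
        for s in range(num_segments):
            start = s * seg_len
            # The last segment absorbs any remainder samples.
            end = len(clean) if s == num_segments - 1 else start + seg_len
            level_loss -= cosine_similarity(enhanced[start:end], clean[start:end])
        total += level_loss / num_segments
    return total / num_levels


clean = [1.0, 2.0, 3.0, 4.0]
noisy = [1.0, 2.0, -3.0, 4.0]
loss_clean = coarse_to_fine_loss(clean, clean)
loss_noisy = coarse_to_fine_loss(noisy, clean)
```

In this sketch, the coarsest level (one segment) reproduces the plain whole-signal cosine similarity loss, while finer levels penalize local mismatches that the whole-signal term can average away.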
Pages: 2743-2747
Number of pages: 5