Assessment of Self-Supervised Denoising Methods for Esophageal Speech Enhancement

Cited by: 0
Authors
Amarjouf, Madiha [1]
Ibn Elhaj, El Hassan [1]
Chami, Mouhcine [2]
Ezzine, Kadria [3]
Di Martino, Joseph [3]
Affiliations
[1] Natl Inst Posts & Telecommun INPT, Res Lab Telecommun Syst Networks & Serv STRS, Res Team Multimedia Signal & Commun Syst MUSICS, Ave Allal Fassi, Rabat 10112, Morocco
[2] Natl Inst Posts & Telecommun INPT, Res Lab Telecommun Syst Networks & Serv STRS, Res Team Secure & Mixed Architecture Reliable Tech, Ave Allal Fassi, Rabat 10112, Morocco
[3] LORIA Lab Lorrain Rech Informat & Ses Applicat, BP 239, F-54506 Vandoeuvre Les Nancy, France
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 15
Keywords
esophageal speech; self-supervised denoising; speech enhancement; DCUNET; DCUNET-cTSTM; STFT; VoiceFixer; VOICE CONVERSION;
DOI
10.3390/app14156682
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Esophageal speech (ES) is a pathological voice that is often difficult to understand. Moreover, recordings of a patient's voice from before a laryngectomy are difficult to obtain, which complicates the enhancement of this kind of voice. For this reason, most supervised methods for enhancing ES rely on voice conversion with healthy speakers as targets, which may not preserve the speaker's identity. Unsupervised methods for ES, in contrast, are mostly based on traditional filters, which alone cannot remove this kind of noise and are known to produce musical artifacts. To address these issues, a self-supervised method based on the Only-Noisy-Training (ONT) model was applied, which denoises a signal without needing a clean target. Four experiments were conducted for assessment using Deep Complex UNET (DCUNET) and Deep Complex UNET with Complex Two-Stage Transformer Module (DCUNET-cTSTM), both of which are based on the ONT approach. In addition, for comparison purposes and to calculate the evaluation metrics, the pre-trained VoiceFixer model was used to restore the clean wave files of esophageal speech. Although ONT-based methods work better with noisy wave files, the results show that ES can be denoised without the need for clean targets, and hence the speaker's identity is retained.
Pages: 14
Related papers
50 in total
  • [41] Self-supervised learning for effective denoising of flow fields
    Yu, Linqi
    Yousif, Mustafa Z.
    Zhou, Dan
    Zhang, Meng
    Lee, Jung Sub
    Lim, Hee-Chang
    PHYSICS OF FLUIDS, 2024, 36 (10)
  • [42] Self-supervised Signal Denoising for Magnetic Particle Imaging
    Peng, Huiling
    Li, Yimeng
    Yang, Xin
    Tian, Jie
    Hui, Hui
    2023 45TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC, 2023
  • [43] Self-supervised learning for denoising of multidimensional MRI data
    Kang, Beomgu
    Lee, Wonil
    Seo, Hyunseok
    Heo, Hye-Young
    Park, Hyunwook
    MAGNETIC RESONANCE IN MEDICINE, 2024, 92 (05) : 1980 - 1994
  • [44] Self-Supervised Joint Learning for pCLE Image Denoising
    Yang, Kun
    Zhang, Haojie
    Qiu, Yufei
    Zhai, Tong
    Zhang, Zhiguo
    SENSORS, 2024, 24 (09)
  • [45] Self-supervised enhanced denoising diffusion for anomaly detection
    Li, Shu
    Yu, Jiong
    Lu, Yi
    Yang, Guangqi
    Du, Xusheng
    Liu, Su
    INFORMATION SCIENCES, 2024, 669
  • [46] Stabilize, Decompose, and Denoise: Self-supervised Fluoroscopy Denoising
    Liu, Ruizhou
    Ma, Qiang
    Cheng, Zhiwei
    Lyu, Yuanyuan
    Wang, Jianji
    Zhou, S. Kevin
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VIII, 2022, 13438 : 13 - 23
  • [47] Denoising Diffusion Autoencoders are Unified Self-supervised Learners
    Xiang, Weilai
    Yang, Hongyu
    Huang, Di
    Wang, Yunhong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 15756 - 15766
  • [48] Self-supervised learning for denoising quasiparticle interference data
    Kuijf, Ilse S.
    Tromp, Willem O.
    Benschop, Tjerk
    Ramones, Nino Philip
    Sulangi, Miguel Antonio
    van Nieuwenburg, Evert P. L.
    Allan, Milan P.
    PHYSICAL REVIEW B, 2025, 111 (03)
  • [49] Self-Supervised OCT Denoising: Streamlined Image Enhancement without Clean Targets or Repeated Scans
    Li, Shijie
    Alexopoulos, Palaiologos
    Zambrano, Ronald
    Vellappally, Anse
    Schuman, Joel S.
    Wollstein, Gadi
    Gerig, Guido
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2024, 65 (07)
  • [50] A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition
    Zhu, Qiu-Shi
    Zhang, Jie
    Zhang, Zi-Qiang
    Dai, Li-Rong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1927 - 1939