Self-supervised Spoofing Audio Detection Scheme

Cited by: 7
Authors
Jiang, Ziyue [1]
Zhu, Hongcheng [1]
Peng, Li [1]
Ding, Wenbing [1]
Ren, Yanzhen [1,2]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan, Peoples R China
[2] Minist Educ, Key Lab Aerosp Informat Secur & Trusted Comp, Beijing, Peoples R China
Keywords
self-supervised learning; ASVspoofing detection; anti-spoofing; deepfake
DOI
10.21437/Interspeech.2020-1760
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
With the development of deep generative technology, spoofed audio produced by speech synthesis and voice conversion is becoming increasingly realistic, which challenges the credibility of media in social networks. This paper proposes a self-supervised spoofing audio detection scheme (SSAD). In SSAD, eight convolutional blocks capture the local features of the audio signal. A temporal convolutional network (TCN) captures context features and allows the computation to run in parallel. Three regression workers and one binary worker are designed to improve performance in fake and spoofed audio detection. Experimental results on the ASVspoof 2019 dataset show that SSAD achieves higher detection accuracy than the state of the art, which indicates that the self-supervised method is effective for spoofing audio detection.
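The abstract names two ideas that a small example may make more concrete: a TCN that gathers context through convolutions, and a training signal built from three regression workers plus one binary worker. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation. The causal form of the convolution, the kernel size, the dilation rates (1, 2, 4), the use of mean squared error and binary cross-entropy, and the unweighted sum of worker losses are all assumptions made for illustration; the abstract does not specify any of them, nor what each worker predicts. All function names are hypothetical.

```python
import math


def dilated_causal_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero (causal padding). Each y[t]
    depends only on the input, not on other outputs, so all time steps
    could be computed in parallel, unlike a recurrent layer.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


def mse(pred, target):
    """Mean squared error, used here as a stand-in regression-worker loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def binary_cross_entropy(prob, label, eps=1e-12):
    """Binary cross-entropy, used here as a stand-in binary-worker loss."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def multi_worker_loss(regression_pairs, binary_prob, binary_label):
    """Unweighted sum of the regression losses and one binary loss (assumed)."""
    regression = sum(mse(pred, target) for pred, target in regression_pairs)
    return regression + binary_cross_entropy(binary_prob, binary_label)


# Stack three dilated layers and feed a unit impulse: the response spreads
# over as many samples as the receptive field predicts.
signal = [1.0] + [0.0] * 7
hidden = signal
for d in (1, 2, 4):
    hidden = dilated_causal_conv1d(hidden, [1.0, 1.0], d)

# Placeholder targets for three regression workers and one binary worker.
pairs = [([0.5, 0.5], [0.5, 0.5])] * 3
total = multi_worker_loss(pairs, binary_prob=0.5, binary_label=1)
```

With kernel size 2 and dilations 1, 2, and 4, the receptive field is 1 + (1 + 2 + 4) = 8 samples, which is why doubling the dilation per layer lets a shallow stack cover a long context.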
Pages: 4223-4227
Number of pages: 5