DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection

Authors
Rabbia Mahum
Aun Irtaza
Ali Javed
Haitham A. Mahmoud
Haseeb Hassan
Affiliations
[1] UET Taxila, Computer Science Department
[2] UET Taxila, Software Engineering Department
[3] King Saud University, Industrial Engineering Department, College of Engineering
[4] Shenzhen Technology University (SZTU), College of Big Data and Internet
Keywords
Deep learning; Spoofing detector; Fake speech detection
Abstract
Spoofed speech is becoming a serious threat to society due to advances in artificial intelligence. An automated spoofing detector that can be integrated into automatic speaker verification (ASV) systems is therefore needed. In this study, we propose a novel and robust model, named DeepDet, based on a deep-layered architecture, to classify speech into two classes: spoofed and bonafide. DeepDet is an improved model based on Yet Another Mobile Network (YAMNet), employing a customized MobileNet combined with a bottleneck attention module (BAM). First, we convert audio into mel-spectrograms, which are time-frequency representations on the mel scale. Second, we train our deep-layered model on mel-spectrograms extracted from the Logical Access (LA) set of the ASVspoof 2019 dataset, which includes synthesized speech and voice conversion. Finally, we classify the audio using the trained binary classifier. More precisely, we exploit the layered architecture and guided attention to distinguish spoofed speech from bonafide samples. The proposed model employs depthwise separable convolutions, which make it lighter than existing techniques. Furthermore, we conducted extensive experiments to assess the performance of the proposed model using the ASVspoof 2019 corpus. We attained an equal error rate (EER) of 0.042% on Logical Access (LA) attacks and 0.43% on Physical Access (PA) attacks. These results on the ASVspoof 2019 dataset indicate the effectiveness of DeepDet over existing spoofing detectors. Additionally, the proposed model is robust enough to identify unseen spoofed audio and to classify the various attacks accurately.
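The equal error rate (EER) reported in the abstract is the operating point at which the false acceptance rate (spoofed audio accepted as bonafide) equals the false rejection rate (bonafide audio rejected). The sketch below illustrates the metric only. It is not the authors' evaluation code, and the function name `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely bonafide" are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a higher score means "more likely bonafide".
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None  # (|FAR - FRR|, EER estimate, threshold)
    for t in thresholds:
        # False rejection: bonafide sample scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed sample scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy scores, not taken from the paper.
bonafide = [0.9, 0.8, 0.7, 0.35]
spoof = [0.1, 0.2, 0.3, 0.75]
eer, threshold = equal_error_rate(bonafide, spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.7
```

With real detector outputs the sweep is usually interpolated between thresholds, so a finite sample gives only an approximation of the crossing point.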
Related papers
  • [1] DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection
    Mahum, Rabbia
    Irtaza, Aun
    Javed, Ali
    Mahmoud, Haitham A.
    Hassan, Haseeb
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
  • [2] DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection (vol 2024, 18, 2024)
    Mahum, Rabbia
    Irtaza, Aun
    Javed, Ali
    Mahmoud, Haitham A.
    Hassan, Haseeb
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
  • [3] Eye diseases detection using deep learning with BAM attention module
    Zia, Amna
    Mahum, Rabbia
    Ahmad, Nabeel
    Awais, Muhammad
    Alshamrani, Ahmad M.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (20) : 59061 - 59084
  • [4] BAM: A Bidirectional Attention Module for Masked Face Recognition
    Shakeel, M. Saad
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022,
  • [5] A deep learning strategy for automatic congestive heart failure detection using novel bottleneck attention module
    Wang, Jibin
    Guo, Xingtian
    APPLIED INTELLIGENCE, 2024, 54 (17-18) : 8120 - 8131
  • [6] EDL-Det: A Robust TTS Synthesis Detector Using VGG19-Based YAMNet and Ensemble Learning Block
    Mahum, Rabbia
    Irtaza, Aun
    Javed, Ali
    IEEE ACCESS, 2023, 11 : 134701 - 134716
  • [7] Facial Expression Recognition Using a Semantic-Based Bottleneck Attention Module
    Zhang, Shengfu
    Xiao, Zhongjie
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2024, 20 (01)
  • [8] Light-weight infrared small target detection combining cross-scale feature fusion with bottleneck attention module
    Lin, Zai-Ping
    Li, Bo-Yang
    Li, Miao
    Wang, Long-Guang
    Wu, Tian-Hao
    Luo, Yi-Hang
    Xiao, Chao
    Li, Ruo-jing
    Wei, An
    JOURNAL OF INFRARED AND MILLIMETER WAVES, 2022, 41 (06) : 1102 - 1112
  • [9] InfraEyeNet: Infrared eye landmark detection network with modified bottleneck module
    Lee, Seungkeon
    Park, Yeongje
    Lee, Eui Chul
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (06)
  • [10] CONTEXT ATTENTION MODULE FOR HUMAN HAND DETECTION
    Xie, Zhihuai
    Wang, Shaojie
    Zhao, Wentian
    Guo, Zhenhua
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2019, : 555 - 560