Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features

Authors
Tong, Chuxuan [1]
Natgunanathan, Iynkaran [1]
Xiang, Yong [1]
Li, Jianhua [1]
Zong, Tianrui [1, 2]
Zheng, Xi [3]
Gao, Longxiang [4, 5]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Geelong, Vic 3125, Australia
[2] Kexin Technol Ltd, Beijing 100020, Peoples R China
[3] Macquarie Univ, Dept Comp, Sydney, NSW 2109, Australia
[4] Qilu Univ Technol, Shandong Acad Sci, Shandong Comp Sci Ctr, Key Lab Comp Power Network & Informat Secur, Minist, Jinan 250316, Peoples R China
[5] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Power Internet & Serv C, Jinan 250316, Peoples R China
Funding
Australian Research Council
Keywords
Watermarking; Feature extraction; Robustness; Decoding; Training; Acoustics; Transformers; Perturbation methods; Generators; Data mining; Audio watermarking; deep neural networks; watermark attacks; TIME-SCALE MODIFICATION; AUDIO WATERMARKING; ATTACKS; SCHEME; DOMAIN;
DOI
10.1109/TASLP.2024.3486206
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206 ; 082403 ;
Abstract
Digital watermarking is an effective approach for safeguarding the copyright of speech signals: ownership information is embedded into the original signal and later extracted from the watermarked signal. Traditional watermarking methods can embed and extract watermarks successfully when the watermarked signals are not exposed to severe alterations, but they cannot withstand attacks such as de-synchronization. In this work, we introduce a novel transformer-based framework designed to enhance the imperceptibility and robustness of speech watermarking. The framework incorporates encoders and decoders built on multi-scale transformer blocks to effectively capture local and long-range features from inputs, such as acoustic features extracted by the Short-Time Fourier Transform (STFT). Further, a deep neural network (DNN) based generator, notably the Transformer architecture, is employed to adaptively embed imperceptible watermarks. These perturbations serve as a step for simulating noise, thereby bolstering watermark robustness during the training phase. Experimental results show the superiority of the proposed framework in terms of watermark imperceptibility and robustness against various watermark attacks. Compared with currently available related techniques, the framework achieves an eightfold increase in embedding rate. It also offers superior practicality, with scalability and reduced inference time of the DNN models.
Pages: 4822-4837
Page count: 16