Exploring Speaker Age Estimation on Different Self-Supervised Learning Models

Authors
Truong, Duc-Tuan [1]
Anh, Tran The [1]
Siong, Chng Eng [1]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
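Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract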
Self-supervised learning (SSL) has played an important role in various speech and audio processing tasks. However, there is limited research on adapting SSL models to predict a speaker's age and gender from speech signals. In this paper, we investigate seven SSL models, namely PASE+, NPC, wav2vec 2.0, XLSR, HuBERT, WavLM, and data2vec, on the joint age estimation and gender classification task using the TIMIT corpus. We also study how the choice of hidden encoder layer within these models affects the age estimation result. Furthermore, we evaluate how the performance of the different SSL models varies when predicting the speaker's age under simulated noisy conditions. The simulated noisy speech is created by mixing clean utterances from the TIMIT test set with random noise from the Music and Noise categories of the MUSAN corpus at multiple signal-to-noise ratio (SNR) levels. Our findings confirm that a recent SSL model, WavLM, obtains better and more robust speech representations than the wav2vec 2.0 model used in the current state-of-the-art (SOTA) approach, achieving reductions of 3.6% and 11.32% in mean absolute error (MAE) on the clean and 5 dB SNR TIMIT test sets, respectively.
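The noise simulation described in the abstract (mixing a clean utterance with noise at a chosen SNR) can be illustrated with a minimal sketch. This is not the authors' code: the function names `power` and `mix_at_snr`, the choice to loop short noise clips, and the toy sine-wave "utterance" are all assumptions made for illustration. It relies only on the standard definition SNR_dB = 10 · log10(P_clean / P_noise).

```python
import math
import random


def power(signal):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: the abstract does not state how noise clips are
    cropped or looped, so looping short clips here is an assumption.
    """
    if len(noise) < len(clean):
        # Repeat the noise clip until it covers the whole utterance.
        repeats = math.ceil(len(clean) / len(noise))
        noise = list(noise) * repeats
    noise = noise[:len(clean)]

    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise clip is silent; cannot scale to a target SNR")

    # SNR_dB = 10 * log10(P_clean / P_noise), solved for the target noise power.
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


# Toy stand-ins for a TIMIT utterance and a MUSAN noise clip (16 kHz assumed).
random.seed(0)
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(clean, noise, snr_db=5)
```

Subtracting `clean` from `noisy` recovers the scaled noise, so the achieved SNR can be checked directly against the requested value.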
Pages: 1950-1955
Number of pages: 6