NON-INTRUSIVE SPEECH QUALITY ASSESSMENT WITH MULTI-TASK LEARNING BASED ON TENSOR NETWORK

Cited by: 1
Authors
Liu, Hanyue [1]
Liu, Miao [1]
Wang, Jing [1]
Xie, Xiang [1]
Yang, Lidong [2]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Inner Mongolia Univ Sci & Technol, Baotou, Peoples R China
Keywords
speech quality assessment; tensor network; matrix product state; multi-task learning
DOI
10.1109/ICASSP48485.2024.10447695
Abstract
Non-intrusive speech quality assessment is increasingly important in speech systems, and existing methods predominantly rely on neural networks to extract low-order features. Typically, these features pass through a low-dimensional linear transformation that yields the network's output, and the intercorrelation between feature points is often overlooked. In this paper, we explore the kernel method, which maps features into a high-dimensional space through dot products, to better capture the relationships among all feature points. Considering the unique advantages of tensors in representing complex data, we extend the use of tensor networks and propose a novel framework that incorporates a matrix product state (MPS) layer to predict the mean opinion score (MOS). With the MPS layer, our model transforms low-order features into higher-order representations, enabling a linear transformation in a high-dimensional space without increasing the number of parameters. Furthermore, we propose a loss function that jointly assesses regression and classification biases, along with the correlation with ground-truth MOS labels. Experimental results demonstrate that the proposed model consistently outperforms the baseline system across all evaluation metrics and surpasses state-of-the-art models on the test set.
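The two ideas in the abstract, a linear map applied in a high-dimensional product feature space through an MPS, and a loss that combines regression, classification, and correlation terms, can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the cosine/sine local feature map, the bond dimension, the five integer MOS classes, the equal loss weights, and all function names are hypothetical.

```python
import numpy as np

def local_feature_map(x):
    """Assumed local map: each scalar x_i in [0, 1] -> [cos(pi*x_i/2), sin(pi*x_i/2)]."""
    return np.stack([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)], axis=-1)

def random_mps(n_sites, bond_dim, rng):
    """Hypothetical initializer: cores of shape (left_bond, 2, right_bond)."""
    shapes = [(1, 2, bond_dim)] + [(bond_dim, 2, bond_dim)] * (n_sites - 1)
    return [rng.normal(scale=0.5, size=s) for s in shapes]

def mps_forward(x, cores, readout):
    """Contract the MPS with the product feature map, site by site."""
    v = np.ones(1)
    for core, phi in zip(cores, local_feature_map(x)):
        v = v @ np.einsum("lpr,p->lr", core, phi)  # absorb one site
    return v @ readout  # shape (n_out,)

def dense_forward(x, cores, readout):
    """Same map written as an explicit linear layer on the 2**n feature vector."""
    phis = local_feature_map(x)
    big_phi = phis[0]
    for phi in phis[1:]:
        big_phi = np.kron(big_phi, phi)  # tensor-product features, length 2**n
    w = cores[0][0]  # shape (2, bond_dim)
    for core in cores[1:]:
        w = np.tensordot(w, core, axes=([w.ndim - 1], [0]))
    w = np.tensordot(w, readout, axes=([w.ndim - 1], [0]))
    return big_phi @ w.reshape(big_phi.size, -1)

def multitask_loss(pred_mos, class_logits, true_mos, w_reg=1.0, w_cls=1.0, w_corr=1.0):
    """Assumed combination: MSE + cross-entropy over 5 MOS classes + (1 - Pearson)."""
    mse = np.mean((pred_mos - true_mos) ** 2)
    labels = np.clip(np.rint(true_mos), 1, 5).astype(int) - 1  # classes 0..4
    shifted = class_logits - class_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(labels)), labels])
    pc, tc = pred_mos - pred_mos.mean(), true_mos - true_mos.mean()
    pearson = (pc @ tc) / (np.linalg.norm(pc) * np.linalg.norm(tc) + 1e-8)
    return w_reg * mse + w_cls * ce + w_corr * (1.0 - pearson)

rng = np.random.default_rng(0)
cores = random_mps(n_sites=6, bond_dim=3, rng=rng)
readout = rng.normal(size=(3, 1))
x = rng.uniform(size=6)
print(mps_forward(x, cores, readout), dense_forward(x, cores, readout))
```

In this sketch the MPS contraction and the dense computation give the same output, but the MPS never materializes the 2**n-dimensional feature vector, and its parameter count grows linearly with the number of features rather than exponentially.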
Pages: 851-855
Page count: 5