NON-INTRUSIVE SPEECH QUALITY ASSESSMENT WITH MULTI-TASK LEARNING BASED ON TENSOR NETWORK

Cited by: 1
Authors
Liu, Hanyue [1]
Liu, Miao [1]
Wang, Jing [1]
Xie, Xiang [1]
Yang, Lidong [2]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Inner Mongolia Univ Sci & Technol, Baotou, Peoples R China
Keywords
speech quality assessment; tensor network; matrix product state; multi-task learning
DOI
10.1109/ICASSP48485.2024.10447695
Abstract
With the growing significance of non-intrusive speech quality assessment in speech systems, existing methods predominantly rely on neural networks to extract low-order features. Typically, these features undergo a low-dimensional linear transformation, yielding the network's output. However, the intercorrelation between feature points is often overlooked. In this paper, we explore the kernel method, which maps features into a high-dimensional space through dot products, in order to enhance the extraction of relationships among all feature points. Considering the unique advantages of tensors in complex data representation, we extend the use of tensor networks and propose a novel framework that incorporates a matrix product state (MPS) layer to predict the mean opinion score (MOS). By integrating the MPS layer, our model can transform low-order features into higher-order representations, facilitating linear transformation in a high-dimensional space without increasing the number of parameters. Furthermore, we propose a loss function that concurrently assesses regression and classification biases, along with correlation with real MOS labels. Experimental results demonstrate that our proposed model consistently outperforms the baseline system across all evaluation metrics and surpasses state-of-the-art models on the test set.
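The abstract does not give the layer or loss in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical local feature map `[1, x]`, a scalar output, and randomly initialised cores. `mps_forward` contracts the cores with the embedded features site by site, and `dense_forward` checks that this equals a linear map applied in the full 2^N-dimensional product space. `multitask_loss` is likewise an assumed combination (mean squared error, cross-entropy over hypothetical MOS classes, and one minus the Pearson correlation) with made-up weights.

```python
import itertools
import math
import random


def feature_map(x):
    """Hypothetical local feature map: lift a scalar feature to a 2-dim vector."""
    return [1.0, x]


def random_mps(n_sites, bond_dim, phys_dim=2, seed=0):
    """Random cores; cores[i][s] is a (left x right) matrix for physical index s."""
    rng = random.Random(seed)
    cores = []
    for i in range(n_sites):
        left = 1 if i == 0 else bond_dim
        right = 1 if i == n_sites - 1 else bond_dim
        cores.append([[[rng.gauss(0.0, 0.5) for _ in range(right)]
                       for _ in range(left)]
                      for _ in range(phys_dim)])
    return cores


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def mps_forward(cores, x):
    """Contract the MPS with the product embedding of x, left to right."""
    v = [1.0]
    for core, xi in zip(cores, x):
        phi = feature_map(xi)
        rows, cols = len(core[0]), len(core[0][0])
        # Mix the physical slices of this core using the local feature vector.
        m = [[sum(phi[s] * core[s][r][c] for s in range(len(phi)))
              for c in range(cols)] for r in range(rows)]
        v = vec_mat(v, m)
    return v[0]


def dense_forward(cores, x):
    """Reference: explicit sum over all 2**N entries of the implied weight tensor."""
    total = 0.0
    for idx in itertools.product(range(2), repeat=len(x)):
        v, coeff = [1.0], 1.0
        for core, xi, s in zip(cores, x, idx):
            v = vec_mat(v, core[s])
            coeff *= feature_map(xi)[s]
        total += coeff * v[0]
    return total


def multitask_loss(pred, target, logits, labels, weights=(1.0, 1.0, 1.0)):
    """Assumed form: MSE + cross-entropy + (1 - Pearson correlation)."""
    n = len(pred)
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    ce = 0.0
    for row, y in zip(logits, labels):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(z - peak) for z in row))
        ce += log_z - row[y]
    ce /= n
    mp, mt = sum(pred) / n, sum(target) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, target))
    norm = math.sqrt(sum((p - mp) ** 2 for p in pred)
                     * sum((t - mt) ** 2 for t in target))
    corr = cov / norm if norm > 0 else 0.0
    return weights[0] * mse + weights[1] * ce + weights[2] * (1.0 - corr)


if __name__ == "__main__":
    cores = random_mps(n_sites=4, bond_dim=3)
    features = [0.2, -0.5, 1.0, 0.7]
    print(mps_forward(cores, features), dense_forward(cores, features))
```

At a fixed bond dimension the number of core entries grows linearly with the number of features, whereas the dense weight tensor it represents grows exponentially. This is the usual motivation for an MPS parameterisation.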
Pages: 851-855
Number of pages: 5