Task-Agnostic Structured Pruning of Speech Representation Models

Cited by: 1
Authors
Wang, Haoyu [1 ]
Wang, Siyuan [1 ]
Zhang, Wei-Qiang [1 ]
Suo, Hongbin [2 ]
Wan, Yulong [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] OPPO, Data & AI Engn Syst, Beijing 100026, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Model pruning; knowledge distillation; model compression; representation learning;
DOI
10.21437/Interspeech.2023-1442
Chinese Library Classification
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
Self-supervised pre-trained models such as Wav2vec 2.0, HuBERT, and WavLM have been shown to significantly improve many speech tasks. However, their large memory footprints and high computational costs hinder industrial deployment. Structured pruning is a hardware-friendly model compression technique, but it usually incurs a larger accuracy loss than unstructured pruning. In this paper, we propose a fine-grained attention head pruning method to compensate for this performance degradation. In addition, we introduce the straight-through estimator into the L0 regularization to further accelerate the pruned model. Experiments on the SUPERB benchmark show that our model achieves performance comparable to the dense model on multiple tasks and outperforms the Wav2vec 2.0 base model on average, with 72% fewer parameters and 2x faster inference.
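The abstract sketches two mechanisms: learnable gates on attention heads trained with an L0 sparsity penalty, and a straight-through estimator (STE) so the gates are hard 0/1 in the forward pass while remaining trainable. The following is a minimal PyTorch sketch of that general recipe, not the authors' implementation; the `HeadGate` class, the open initialization, and the penalty weight `1e-3` are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): one learnable gate per
# attention head, binarized in the forward pass with a straight-through
# estimator, plus an expected-L0 penalty that drives gates toward zero.
import torch
import torch.nn as nn

class HeadGate(nn.Module):
    def __init__(self, num_heads: int):
        super().__init__()
        # Start with gates open so training begins from the dense model.
        self.logits = nn.Parameter(torch.full((num_heads,), 2.0))

    def forward(self) -> torch.Tensor:
        probs = torch.sigmoid(self.logits)   # soft gate in (0, 1)
        hard = (probs > 0.5).float()         # hard 0/1 gate used in forward
        # Straight-through estimator: the forward value is `hard`, but the
        # gradient flows through `probs` as if the gate were soft.
        return hard + probs - probs.detach()

    def l0_penalty(self) -> torch.Tensor:
        # Differentiable surrogate for the number of surviving heads.
        return torch.sigmoid(self.logits).sum()

# Usage: scale each head's output by its gate and regularize the loss.
gate = HeadGate(num_heads=12)
head_out = torch.randn(2, 12, 50, 64)        # (batch, heads, time, head_dim)
gated = head_out * gate()[None, :, None, None]
loss = gated.pow(2).mean() + 1e-3 * gate.l0_penalty()
loss.backward()                              # logits get gradients via STE
```

After training, heads whose gates settle at zero can be physically removed from the weight matrices, which is what makes this form of pruning hardware-friendly and yields the reported inference speedup.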
Pages: 231-235
Number of pages: 5
Related Papers
50 in total
  • [31] Latent Plans for Task-Agnostic Offline Reinforcement Learning
    Rosete-Beas, Erick
    Mees, Oier
    Kalweit, Gabriel
    Boedecker, Joschka
    Burgard, Wolfram
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1838 - 1849
  • [32] Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
    Du, Yilun
    Narasimhan, Karthik
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [33] Task-agnostic feature extractors for incremental learning at the edge
    Loomis, Lisa
    Wise, David
    Inkawhich, Nathan
    Thiem, Clare
    McDonald, Nathan
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS VI, 2024, 13051
  • [34] Task-Agnostic Amortized Inference of Gaussian Process Hyperparameters
    Liu, Sulin
    Sun, Xingyuan
    Ramadge, Peter J.
    Adams, Ryan P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [35] TADA: Efficient Task-Agnostic Domain Adaptation for Transformers
    Hung, Chia-Chien
    Lange, Lukas
    Stroetgen, Jannik
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 487 - 503
  • [36] COSMIC: Mutual Information for Task-Agnostic Summarization Evaluation
    Darrin, Maxime
    Formont, Philippe
    Cheung, Jackie Chi Kit
    Piantanida, Pablo
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 12696 - 12717
  • [37] Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
    Xu, Dongkuan
    Mukherjee, Subhabrata
    Liu, Xiaodong
    Dey, Debadeepta
    Wang, Wenhui
    Zhang, Xiang
    Awadallah, Ahmed Hassan
    Gao, Jianfeng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [38] CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration
    Yang, Qisong
    Spaan, Matthijs T. J.
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 10798 - 10806
  • [39] Task-Agnostic Continual Hippocampus Segmentation for Smooth Population Shifts
    Gonzalez, Camila
    Ranem, Amin
    Othman, Ahmed
    Mukhopadhyay, Anirban
    DOMAIN ADAPTATION AND REPRESENTATION TRANSFER (DART 2022), 2022, 13542 : 108 - 118
  • [40] FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling
    Lu, Hao
    Liu, Wenze
    Fu, Hongtao
    Cao, Zhiguo
    COMPUTER VISION - ECCV 2022, PT XXVII, 2022, 13687 : 231 - 247