Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks

Cited by: 0

Authors:

Chang, Kai-Wei [1]

Hsu, Ming-Hao [2]

Li, Shan-Wen [3]

Lee, Hung-yi [2]

Affiliations:

[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan

[2] Natl Taiwan Univ, Dept Elect Engn, Taipei, Taiwan

[3] Meta AI, Menlo Pk, CA USA

Source:

INTERSPEECH 2024 | 2024

Keywords:

In-context learning; speech language model; prompt tuning; few-shot learning; speech classification

DOI:

10.21437/Interspeech.2024-1932

Abstract:

Since the release of GPT-3, in-context learning (ICL) has played an essential role in the use of large language models (LLMs) in natural language processing (NLP). When utterance-label demonstrations are presented at the input, the LM can perform few-shot learning without gradient descent or any explicit modification of its parameters, which lets it handle various downstream tasks in a black-box manner. Despite the success of ICL in NLP, little work has explored ICL in speech processing. This study is the first to explore ICL for speech classification tasks with a textless speech LM. We first show that the current speech LM lacks ICL capability. We then perform warmup training on the speech LM, equipping it with the ability to learn from demonstrations. This paper explores and proposes the first speech LM capable of performing unseen classification tasks in an ICL manner.
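The idea of presenting utterance-label demonstrations at the input can be illustrated with a minimal sketch. It assumes each utterance has already been converted to a sequence of discrete unit ids, that labels are also represented as token ids, and that a separator token marks the boundaries. The function name `build_icl_prompt`, the separator id, and the layout are hypothetical choices for illustration, not the prompt format used in the paper.

```python
from typing import List, Sequence, Tuple

SEP = -1  # hypothetical separator token id (an assumption, not from the paper)


def build_icl_prompt(
    demos: Sequence[Tuple[List[int], int]],
    query: List[int],
    sep: int = SEP,
) -> List[int]:
    """Concatenate (utterance units, label) demonstrations, then the query.

    The resulting sequence ends with a separator, so a language model
    would be expected to predict the query's label as the next token.
    No model parameters are updated at any point.
    """
    prompt: List[int] = []
    for units, label in demos:
        prompt.extend(units)   # discrete units of the demonstration utterance
        prompt.append(sep)     # boundary between utterance and its label
        prompt.append(label)   # label token for this demonstration
        prompt.append(sep)     # boundary before the next demonstration
    prompt.extend(query)       # units of the utterance to classify
    prompt.append(sep)         # the model continues from here
    return prompt
```

For example, two demonstrations with label ids 100 and 101 followed by a query utterance yield a single flat token sequence that a frozen LM could consume as-is.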

Pages: 4139-4143

Number of pages: 5
