COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

Cited by: 0
Authors
Pan, Jing [1 ]
Wu, Jian [1 ]
Gaur, Yashesh [1 ]
Sivasankaran, Sunit [1 ]
Chen, Zhuo [1 ]
Liu, Shujie [1 ]
Li, Jinyu [1 ]
Affiliations
[1] Microsoft, One Microsoft Way, Redmond, WA 98052 USA
Source
INTERSPEECH 2024
Keywords
multi modality; large language model; speech in-context learning; instruction tuning
DOI
10.21437/Interspeech.2024-1346
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We present a cost-effective method to integrate speech into a large language model (LLM), resulting in a Contextual Speech Model with Instruction-following/in-context-learning Capabilities (COSMIC) multi-modal LLM. Using GPT-3.5, we generate Speech Comprehension Test Question-Answer (SQA) pairs from speech transcriptions for supervised instruction tuning. With under 30 million trainable parameters and only 450 hours of English speech data, COSMIC demonstrates emerging capabilities in instruction-following and in-context learning. Equipped with such capabilities, COSMIC achieves a maximum 33.18 BLEU score in 0-shot EN-to-X speech-to-text translation (S2TT) and a significant boost in the 1-shot setting. Additionally, there is an average 25.8% relative Word Error Rate (WER) reduction for 1-shot cross-domain adaptation. COSMIC exhibits a significant automatic speech recognition (ASR) accuracy gain in contextual biasing tasks due to its instruction-following capability.
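The abstract describes generating SQA pairs from speech transcriptions with GPT-3.5 for supervised instruction tuning. The sketch below illustrates how such transcript-to-QA generation could be prompted; it is an assumption for illustration only, using the OpenAI Python SDK (>= 1.0), and the prompt wording and helper name `generate_sqa_pairs` are hypothetical rather than the authors' actual pipeline.

```python
# Illustrative sketch only: producing question-answer (SQA) pairs from a speech
# transcription with an LLM, in the spirit of the abstract. The prompt text,
# helper name, and use of the OpenAI Python SDK (>= 1.0) are assumptions, not
# the COSMIC authors' actual data-generation pipeline.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_sqa_pairs(transcription: str, num_pairs: int = 3) -> list[dict]:
    """Ask the model for comprehension QA pairs about a transcript."""
    prompt = (
        f"Here is a speech transcription:\n\"{transcription}\"\n\n"
        f"Write {num_pairs} comprehension questions about it, each with a short "
        "answer. Return a JSON list of objects with keys 'question' and 'answer'."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    # A robust pipeline would validate the model output before parsing.
    return json.loads(resp.choices[0].message.content)

if __name__ == "__main__":
    pairs = generate_sqa_pairs(
        "The committee approved the new budget after a two-hour debate."
    )
    for p in pairs:
        print(p["question"], "->", p["answer"])
```

In a setup like the one the abstract outlines, each generated pair would presumably be combined with the original audio so that the question serves as the instruction and the answer as the supervision target during instruction tuning.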
Pages: 4164-4168
Page count: 5