COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

Cited by: 0
Authors
Pan, Jing [1 ]
Wu, Jian [1 ]
Gaur, Yashesh [1 ]
Sivasankaran, Sunit [1 ]
Chen, Zhuo [1 ]
Liu, Shujie [1 ]
Li, Jinyu [1 ]
Affiliations
[1] Microsoft, One Microsoft Way, Redmond, WA 98052 USA
Source
INTERSPEECH 2024
Keywords
multi modality; large language model; speech in-context learning; instruction tuning
DOI
10.21437/Interspeech.2024-1346
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We present a cost-effective method to integrate speech into a large language model (LLM), resulting in a Contextual Speech Model with Instruction-following/in-context-learning Capabilities (COSMIC) multi-modal LLM. Using GPT-3.5, we generate Speech Comprehension Test Question-Answer (SQA) pairs from speech transcriptions for supervised instruction tuning. With under 30 million trainable parameters and only 450 hours of English speech data, COSMIC demonstrates emerging capabilities in instruction-following and in-context learning. Equipped with such capabilities, COSMIC achieves a maximum 33.18 BLEU score in 0-shot EN-to-X speech-to-text translation (S2TT) and a significant boost in the 1-shot setting. Additionally, there is an average 25.8% relative Word Error Rate (WER) reduction for 1-shot cross-domain adaptation. COSMIC exhibits a significant automatic speech recognition (ASR) accuracy gain in contextual biasing tasks due to its instruction-following capability.
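The abstract describes generating SQA pairs from speech transcriptions with GPT-3.5 for supervised instruction tuning. The sketch below illustrates how such transcript-to-QA generation could be prompted; it is an assumption for illustration only, using the OpenAI Python SDK (>= 1.0), and the prompt wording and helper name `generate_sqa_pairs` are hypothetical rather than the authors' actual pipeline.

```python
# Illustrative sketch only: producing question-answer (SQA) pairs from a speech
# transcription with an LLM, in the spirit of the abstract. The prompt text,
# helper name, and use of the OpenAI Python SDK (>= 1.0) are assumptions, not
# the COSMIC authors' actual data-generation pipeline.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_sqa_pairs(transcription: str, num_pairs: int = 3) -> list[dict]:
    """Ask the model for comprehension QA pairs about a transcript."""
    prompt = (
        f"Here is a speech transcription:\n\"{transcription}\"\n\n"
        f"Write {num_pairs} comprehension questions about it, each with a short "
        "answer. Return a JSON list of objects with keys 'question' and 'answer'."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    # A robust pipeline would validate the model output before parsing.
    return json.loads(resp.choices[0].message.content)

if __name__ == "__main__":
    pairs = generate_sqa_pairs(
        "The committee approved the new budget after a two-hour debate."
    )
    for p in pairs:
        print(p["question"], "->", p["answer"])
```

In a setup like the one the abstract outlines, each generated pair would presumably be combined with the original audio so that the question serves as the instruction and the answer as the supervision target during instruction tuning.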
Pages: 4164-4168
Page count: 5