Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks

Authors
Chang, Kai-Wei [1]
Hsu, Ming-Hao [2]
Li, Shan-Wen [3]
Lee, Hung-yi [2]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan
[2] Natl Taiwan Univ, Dept Elect Engn, Taipei, Taiwan
[3] Meta AI, Menlo Pk, CA USA
Source
INTERSPEECH 2024 | 2024
Keywords
In-context learning; speech language model; prompt tuning; few-shot learning; speech classification
DOI
10.21437/Interspeech.2024-1932
Abstract
Since the introduction of GPT-3 in natural language processing (NLP), in-context learning (ICL) has played an essential role in utilizing large language models (LLMs). When utterance-label demonstrations are presented at its input, the LM can accomplish few-shot learning without gradient descent or any explicit modification of its parameters. This enables the LM to perform various downstream tasks in a black-box manner. Despite the success of ICL in NLP, little work has explored the possibility of ICL in speech processing. This study is the first to explore ICL for speech classification tasks with a textless speech LM. We first show that the current speech LM lacks ICL capability. We then perform warmup training on the speech LM, equipping it with demonstration learning capability. This paper explores and proposes the first speech LM capable of performing unseen classification tasks in an ICL manner.
Pages: 4139-4143
Page count: 5