Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking

Cited by: 0
Authors
Chen, Derek [1 ]
Qian, Kun [1 ]
Yu, Zhou [1 ]
Affiliations
[1] Columbia Univ, Dialogue NLP Lab, New York, NY 10027 USA
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further with the addition of a few labeled in-context exemplars to guide output generation. However, for more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial, leading to unstable results. Furthermore, building in-context exemplars for dialogue tasks is difficult because conversational contexts are long while model input lengths are relatively short. To overcome these issues, we first adapt a meta-learning scheme to the dialogue domain which stabilizes the ability of the model to perform well under various prompts. We additionally design a novel training method to improve upon vanilla retrieval mechanisms to find ideal in-context examples. Finally, we introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query. In effect, we are able to achieve highly competitive results for few-shot DST on MultiWOZ.
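The abstract describes retrieving labeled in-context exemplars and assembling them into a prompt for a PLM to complete. As a rough, minimal sketch of that general idea (not the authors' implementation), the Python snippet below uses a simple TF-IDF retriever in place of the paper's trained retriever, and a hypothetical three-dialogue support pool; the exemplar texts, state strings, and the build_prompt helper are illustrative assumptions only.

```python
# Minimal sketch of retrieval-augmented in-context prompting for DST.
# NOT the authors' code: TF-IDF stands in for a learned retriever, and the
# support pool / state format are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Labeled support pool: (dialogue context, annotated dialogue state)
support_pool = [
    ("i need a cheap hotel in the north with free parking",
     "hotel-price=cheap; hotel-area=north; hotel-parking=yes"),
    ("book a table for two at an italian restaurant in the centre",
     "restaurant-food=italian; restaurant-area=centre; restaurant-people=2"),
    ("i want a train to cambridge leaving after 09:00 on tuesday",
     "train-dest=cambridge; train-leaveat=09:00; train-day=tuesday"),
]

def build_prompt(query_context: str, k: int = 2) -> str:
    """Retrieve the k most similar labeled dialogues and prepend them
    as in-context exemplars before the unlabeled query."""
    contexts = [ctx for ctx, _ in support_pool]
    vectorizer = TfidfVectorizer().fit(contexts + [query_context])
    sims = cosine_similarity(
        vectorizer.transform([query_context]),
        vectorizer.transform(contexts),
    )[0]
    top_k = sims.argsort()[::-1][:k]

    lines = []
    for i in top_k:
        ctx, state = support_pool[i]
        lines.append(f"Dialogue: {ctx}\nState: {state}\n")
    lines.append(f"Dialogue: {query_context}\nState:")  # PLM completes this
    return "\n".join(lines)

if __name__ == "__main__":
    print(build_prompt("looking for an expensive hotel near the centre"))
```

In the paper's setting, a trained retriever and saliency-based truncation of long dialogue contexts would take the place of the TF-IDF similarity and the raw context strings used here.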
Pages: 1551-1564
Number of pages: 14