BAM! Born-Again Multi-Task Networks for Natural Language Understanding

Cited by: 0
Authors
Clark, Kevin [1]
Luong, Minh-Thang [2]
Khandelwal, Urvashi [1]
Manning, Christopher D. [1]
Le, Quoc V. [2]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Google Brain, Mountain View, CA USA
Source
57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 2019
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts. To help address this, we propose using knowledge distillation where single-task models teach a multi-task model. We enhance this training with teacher annealing, a novel method that gradually transitions the model from distillation to supervised learning, helping the multi-task model surpass its single-task teachers. We evaluate our approach by multi-task fine-tuning BERT on the GLUE benchmark. Our method consistently improves over standard single-task and multi-task training.
Pages: 5931-5937
Page count: 7
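
The abstract describes knowledge distillation with teacher annealing: during multi-task fine-tuning, the training target for each task is interpolated between the single-task teacher's predictions and the gold labels, with the weight shifted toward the gold labels as training proceeds. Below is a minimal Python/PyTorch sketch of such a loss; the linear schedule and all names (teacher_annealed_loss, lam, etc.) are illustrative assumptions, not the authors' released implementation.

    import torch.nn.functional as F

    def teacher_annealed_loss(student_logits, teacher_logits, gold_labels, step, total_steps):
        # Annealing weight: 0 at the start (pure distillation from the
        # single-task teacher) -> 1 at the end (pure supervised learning on
        # gold labels). A linear schedule is an assumption; the abstract only
        # says the model gradually transitions from distillation to
        # supervised learning.
        lam = min(step / total_steps, 1.0)

        log_probs = F.log_softmax(student_logits, dim=-1)

        # Supervised term: cross-entropy against the gold labels.
        supervised = F.nll_loss(log_probs, gold_labels)

        # Distillation term: cross-entropy against the teacher's soft predictions.
        teacher_probs = F.softmax(teacher_logits, dim=-1)
        distill = -(teacher_probs * log_probs).sum(dim=-1).mean()

        return lam * supervised + (1.0 - lam) * distill

In this setup each GLUE task would have its own single-task teacher, and the multi-task BERT student would apply this loss to whichever task's batch is sampled at a given training step.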