BAM! Born-Again Multi-Task Networks for Natural Language Understanding

Cited by: 0
Authors
Clark, Kevin [1]
Luong, Minh-Thang [2]
Khandelwal, Urvashi [1]
Manning, Christopher D. [1]
Le, Quoc V. [2]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Google Brain, Mountain View, CA USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts. To help address this, we propose using knowledge distillation where single-task models teach a multi-task model. We enhance this training with teacher annealing, a novel method that gradually transitions the model from distillation to supervised learning, helping the multi-task model surpass its single-task teachers. We evaluate our approach by multi-task fine-tuning BERT on the GLUE benchmark. Our method consistently improves over standard single-task and multi-task training.
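The abstract describes teacher annealing only in words, so the following is a minimal sketch of one way the idea could work, not the authors' implementation. It assumes the training target is a linear mix of the gold label and the single-task teacher's predicted distribution, with the weight on the gold label rising from 0 to 1 over training. The names `annealed_target`, `cross_entropy`, and the linear schedule are illustrative assumptions that do not come from this record.

```python
import math


def annealed_target(gold, teacher_probs, step, total_steps):
    """Blend a one-hot gold label with the teacher's distribution.

    Assumption: the weight on the gold label grows linearly from 0
    (pure distillation) to 1 (pure supervised learning).
    """
    lam = min(max(step / total_steps, 0.0), 1.0)
    return [
        lam * (1.0 if i == gold else 0.0) + (1.0 - lam) * p
        for i, p in enumerate(teacher_probs)
    ]


def cross_entropy(target, student_probs, eps=1e-12):
    """Cross-entropy of the student's distribution against a soft target."""
    return -sum(t * math.log(max(q, eps)) for t, q in zip(target, student_probs))


# Hypothetical three-class example: the teacher and the student are fixed here
# only to show how the target moves from the teacher's output to the gold label.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]
for step in (0, 50, 100):
    target = annealed_target(0, teacher, step, 100)
    loss = cross_entropy(target, student)
    print(step, [round(t, 3) for t in target], round(loss, 4))
```

Early in training the student is pulled toward the teacher's soft predictions; by the final step the target is the gold label alone, which is what would let the multi-task model move past its teachers under this reading.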
Pages: 5931-5937
Number of pages: 7
Related papers
  • [1] Multi-Task Deep Neural Networks for Natural Language Understanding
    Liu, Xiaodong
    He, Pengcheng
    Chen, Weizhu
    Gao, Jianfeng
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 4487 - 4496
  • [2] The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
    Liu, Xiaodong
    Wang, Yu
    Ji, Jianshu
    Cheng, Hao
    Zhu, Xueyun
    Awa, Emmanuel
    He, Pengcheng
    Chen, Weizhu
    Poon, Hoifung
    Cao, Guihong
    Gao, Jianfeng
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020): SYSTEM DEMONSTRATIONS, 2020, : 118 - 126
  • [3] Born-Again Neural Networks
    Furlanello, Tommaso
    Lipton, Zachary C.
    Tschannen, Michael
    Itti, Laurent
    Anandkumar, Anima
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [4] BORN-AGAIN CATALOGING IN THE ONLINE NETWORKS
    HAFTER, R
    COLLEGE & RESEARCH LIBRARIES, 1986, 47 (04): 360 - 364
  • [5] Hierarchical and Bidirectional Joint Multi-Task Classifiers for Natural Language Understanding
    Ji, Xiaoyu
    Hu, Wanyang
    Liang, Yanyan
    MATHEMATICS, 2023, 11 (24)
  • [6] Bidirectional Transformer Based Multi-Task Learning for Natural Language Understanding
    Tripathi, Suraj
    Singh, Chirag
    Kumar, Abhay
    Pandey, Chandan
    Jain, Nishant
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2019), 2019, 11608 : 54 - 65
  • [7] Macular: a Multi-Task Adversarial Framework for Cross-Lingual Natural Language Understanding
    Wang, Haoyu
    Wang, Yaqing
    Wu, Feijie
    Xue, Hongfei
    Gao, Jing
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 5061 - 5070
  • [8] Multi-task learning approach for utilizing temporal relations in natural language understanding tasks
    Lim, Chae-Gyun
    Jeong, Young-Seob
    Choi, Ho-Jin
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [9] UAV Path Planning in Multi-Task Environments with Risks through Natural Language Understanding
    Wang, Chang
    Zhong, Zhiwei
    Xiang, Xiaojia
    Zhu, Yi
    Wu, Lizhen
    Yin, Dong
    Li, Jie
    DRONES, 2023, 7 (03)