Span Selection Pre-training for Question Answering

Cited by: 0
Authors
Glass, Michael [1 ]
Gliozzo, Alfio [1 ]
Chakravarti, Rishav [1 ]
Ferritto, Anthony [1 ]
Pan, Lin [1 ]
Bhargav, G. P. Shrivatsa [2 ]
Garg, Dinesh [1 ]
Sil, Avirup [1 ]
Affiliations
[1] IBM Res AI, Armonk, NY 10504 USA
[2] IISC, Dept CSA, Bangalore, Karnataka, India
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task, inspired by reading comprehension, to better align the pre-training from memorization to understanding. Span Selection Pre-Training (SSPT) poses cloze-like training instances, but rather than drawing the answer from the model's parameters, it is selected from a relevant passage. We find significant and consistent improvements over both BERT-BASE and BERT-LARGE on multiple Machine Reading Comprehension (MRC) datasets. Specifically, our proposed model obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 F1 points on short answer prediction. We also show significant impact on HotpotQA, improving answer prediction F1 by 4 points and supporting fact prediction F1 by 1 point, and outperforming the previous best system. Moreover, we show that our pre-training approach is particularly effective when training data is limited, substantially improving the learning curve.
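The abstract's description of SSPT (blank out a span to form a cloze-style query, then select that span from a relevant passage with an extractive start/end head) can be illustrated with a minimal sketch. The helper make_sspt_instance, the [BLANK] marker, and the toy query/passage below are illustrative assumptions, not the authors' released pipeline; the sketch only shows the shape of one training instance and the span-selection loss a pre-training step would minimize.

```python
# Hedged sketch of one Span Selection Pre-Training (SSPT) instance, assuming a
# BERT-style extractive head (start/end pointers over the passage). The helper,
# the [BLANK] marker, and the toy strings are assumptions for illustration.
import torch
from transformers import BertTokenizerFast, BertForQuestionAnswering

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")

# Treat [BLANK] as a single token so the cloze query stays compact (assumption).
tokenizer.add_special_tokens({"additional_special_tokens": ["[BLANK]"]})
model.resize_token_embeddings(len(tokenizer))

def make_sspt_instance(query_sentence, answer_span, passage):
    """Blank the answer span in the query (cloze-style) and locate it in the passage."""
    cloze_query = query_sentence.replace(answer_span, "[BLANK]", 1)
    enc = tokenizer(cloze_query, passage, return_tensors="pt",
                    return_offsets_mapping=True, truncation=True)
    # Map the answer's character span in the passage to token positions
    # within the concatenated (query, passage) input.
    char_start = passage.index(answer_span)
    char_end = char_start + len(answer_span)
    seq_ids = enc.sequence_ids(0)            # 0 = query tokens, 1 = passage tokens
    tok_start = tok_end = None
    for i, (s, e) in enumerate(enc["offset_mapping"][0].tolist()):
        if seq_ids[i] != 1:
            continue
        if tok_start is None and s <= char_start < e:
            tok_start = i
        if s < char_end <= e:
            tok_end = i
    enc.pop("offset_mapping")                # not a model input
    return enc, tok_start, tok_end

enc, start, end = make_sspt_instance(
    query_sentence="Span selection pre-training is evaluated on Natural Questions.",
    answer_span="Natural Questions",
    passage="We obtain state-of-the-art results on Natural Questions, a recent MRC benchmark.",
)
# Cross-entropy over the start and end positions of the blanked span in the
# passage: the answer is selected from the passage, not from model parameters.
loss = model(**enc,
             start_positions=torch.tensor([start]),
             end_positions=torch.tensor([end])).loss
print(float(loss))
```

In an actual pre-training run this loss would be minimized over a large corpus of automatically generated query/passage pairs before fine-tuning on a downstream MRC dataset; the data-generation details here are assumed, not taken from the paper.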
Pages: 2773 - 2782
Number of pages: 10
Related Papers (50 in total)
  • [1] Improving Question Answering by Commonsense-Based Pre-training
    Zhong, Wanjun
    Tang, Duyu
    Duan, Nan
    Zhou, Ming
    Wang, Jiahai
    Yin, Jian
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING (NLPCC 2019), PT I, 2019, 11838 : 16 - 28
  • [2] Design as Desired: Utilizing Visual Question Answering for Multimodal Pre-training
    Su, Tongkun
    Li, Jun
    Zhang, Xi
    Jin, Haibo
    Chen, Hao
    Wang, Qiong
    Lv, Faqin
    Zhao, Baoliang
    Hu, Ying
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT IV, 2024, 15004 : 602 - 612
  • [3] Evaluation of Dataset Selection for Pre-Training and Fine-Tuning Transformer Language Models for Clinical Question Answering
    Soni, Sarvesh
    Roberts, Kirk
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5532 - 5538
  • [4] Question Answering Infused Pre-training of General-Purpose Contextualized Representations
    Jia, Robin
    Lewis, Mike
    Zettlemoyer, Luke
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 711 - 728
  • [5] ProQA: Structural Prompt-based Pre-training for Unified Question Answering
    Zhong, Wanjun
    Gao, Yifan
    Ding, Ning
    Qin, Yujia
    Liu, Zhiyuan
    Zhou, Ming
    Wang, Jiahai
    Yin, Jian
    Duan, Nan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 4230 - 4243
  • [6] Relation-Guided Pre-Training for Open-Domain Question Answering
    Hu, Ziniu
    Sun, Yizhou
    Chang, Kai-Wei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 3431 - 3448
  • [7] QUESTION ANSWERING SYSTEM BASED ON PRE-TRAINING MODEL AND RETRIEVAL RERANKING FOR INDUSTRY 4.0
    Chen, Ta-Fu
    Lin, Yi-Xing
    Su, Ming-Hsiang
    Chen, Po-Kai
    Tai, Tzu-Chiang
    Wang, Jia-Ching
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2178 - 2181
  • [8] Pre-Training Methods for Question Reranking
    Campese, Stefano
    Lauriola, Ivano
    Moschitti, Alessandro
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 469 - 476
  • [9] THE PRE-TRAINING SELECTION OF TEACHERS
    Barr, A. S.
    Douglas, Lois
    JOURNAL OF EDUCATIONAL RESEARCH, 1934, 28 (02): : 92 - 117
  • [10] Contrastive Pre-training and Representation Distillation for Medical Visual Question Answering Based on Radiology Images
    Liu, Bo
    Zhan, Li-Ming
    Wu, Xiao-Ming
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT II, 2021, 12902 : 210 - 220