Span Selection Pre-training for Question Answering

Cited by: 0
Authors
Glass, Michael [1 ]
Gliozzo, Alfio [1 ]
Chakravarti, Rishav [1 ]
Ferritto, Anthony [1 ]
Pan, Lin [1 ]
Bhargav, G. P. Shrivatsa [2 ]
Garg, Dinesh [1 ]
Sil, Avirup [1 ]
Affiliations
[1] IBM Res AI, Armonk, NY 10504 USA
[2] IISC, Dept CSA, Bangalore, Karnataka, India
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task, inspired by reading comprehension, to better align pre-training with understanding rather than memorization. Span Selection Pre-Training (SSPT) poses cloze-like training instances, but rather than drawing the answer from the model's parameters, the answer is selected from a relevant passage. We find significant and consistent improvements over both BERT-BASE and BERT-LARGE on multiple Machine Reading Comprehension (MRC) datasets. Specifically, our proposed model obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 F1 points on short answer prediction. We also show significant impact on HotpotQA, improving answer prediction F1 by 4 points and supporting fact prediction F1 by 1 point, outperforming the previous best system. Moreover, we show that our pre-training approach is particularly effective when training data is limited, substantially improving the learning curve.
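As a rough illustration of the idea described in the abstract, the sketch below builds a single span-selection pre-training instance: an answer term is blanked out of a query sentence, and the training target is the start/end of that term in a relevant passage, so the model must select the answer from the passage rather than recall it from its parameters. The function name make_sspt_instance, the "[BLANK]" marker, whitespace tokenization, exact string matching, and the example data are assumptions for illustration only; the paper's actual data pipeline (passage retrieval, subword tokenization) is not reproduced here.

```python
# Minimal sketch of constructing one span-selection pre-training instance
# (illustrative assumptions: whitespace tokens, exact-match span search).

def make_sspt_instance(query_sentence, answer_term, passage):
    """Blank out `answer_term` in the query and locate its span in the passage."""
    if answer_term not in query_sentence or answer_term not in passage:
        return None  # skip instances whose answer cannot be grounded in the passage

    # Cloze-style query: the answer is removed, so it cannot be read off the
    # query itself; the model has to select the span from the passage.
    query = query_sentence.replace(answer_term, "[BLANK]", 1)

    # Locate the answer span (start/end token indices) in the passage.
    passage_tokens = passage.split()
    answer_tokens = answer_term.split()
    for start in range(len(passage_tokens) - len(answer_tokens) + 1):
        if passage_tokens[start:start + len(answer_tokens)] == answer_tokens:
            end = start + len(answer_tokens) - 1
            return {"query": query, "passage": passage, "start": start, "end": end}
    return None


# Hypothetical example data, for illustration only.
example = make_sspt_instance(
    query_sentence="The Eiffel Tower was completed in 1889.",
    answer_term="1889",
    passage="Construction of the Eiffel Tower finished in 1889 for the World's Fair.",
)
print(example)
```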
Pages: 2773-2782
Page count: 10
Related Papers
50 items in total
  • [41] Pre-Training to Learn in Context
    Gu, Yuxian
    Dong, Li
    Wei, Furu
    Huang, Minlie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 4849 - 4870
  • [42] Improving Fractal Pre-training
    Anderson, Connor
    Farrell, Ryan
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2412 - 2421
  • [43] Pre-training via Paraphrasing
    Lewis, Mike
    Ghazvininejad, Marjan
    Ghosh, Gargi
    Aghajanyan, Armen
    Wang, Sida
    Zettlemoyer, Luke
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [44] Pre-training phenotyping classifiers
    Dligach, Dmitriy
    Afshar, Majid
    Miller, Timothy
    JOURNAL OF BIOMEDICAL INFORMATICS, 2021, 113 (113)
  • [45] PERM: Pre-training Question Embeddings via Relation Map for Improving Knowledge Tracing
    Wang, Wentao
    Ma, Huifang
    Zhao, Yan
    Yang, Fanyi
    Chang, Liang
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III, 2022, : 281 - 288
  • [46] SEEP: Semantic-enhanced question embeddings pre-training for improving knowledge tracing
    Wang, Wentao
    Ma, Huifang
    Zhao, Yan
    Yang, Fanyi
    Chang, Liang
    INFORMATION SCIENCES, 2022, 614 : 153 - 169
  • [48] Rethinking Pre-training and Self-training
    Zoph, Barret
    Ghiasi, Golnaz
    Lin, Tsung-Yi
    Cui, Yin
    Liu, Hanxiao
    Cubuk, Ekin D.
    Le, Quoc V.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [49] Pre-Training Model and Client Selection Optimization for Improving Federated Learning Efficiency
    Ge, Bingchen
    Zhou, Ying
    Xie, Liping
    Kou, Lirong
    2024 9TH INTERNATIONAL CONFERENCE ON ELECTRONIC TECHNOLOGY AND INFORMATION SCIENCE, ICETIS 2024, 2024, : 650 - 660
  • [50] Question Condensing Networks for Answer Selection in Community Question Answering
    Wu, Wei
    Sun, Xu
    Wang, Houfeng
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1746 - 1755