Span Selection Pre-training for Question Answering

Cited by: 0
|
Authors
Glass, Michael [1 ]
Gliozzo, Alfio [1 ]
Chakravarti, Rishav [1 ]
Ferritto, Anthony [1 ]
Pan, Lin [1 ]
Bhargav, G. P. Shrivatsa [2 ]
Garg, Dinesh [1 ]
Sil, Avirup [1 ]
Affiliations
[1] IBM Res AI, Armonk, NY 10504 USA
[2] IISC, Dept CSA, Bangalore, Karnataka, India
Keywords: (none listed)
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task, inspired by reading comprehension, that shifts the emphasis of pre-training from memorization to understanding. Span Selection Pre-Training (SSPT) poses cloze-like training instances, but rather than drawing the answer from the model's parameters, the answer is selected from a relevant passage. We find significant and consistent improvements over both BERT-BASE and BERT-LARGE on multiple Machine Reading Comprehension (MRC) datasets. Specifically, our proposed model obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 F1 points on short answer prediction. We also show significant impact on HotpotQA, improving answer prediction F1 by 4 points and supporting fact prediction F1 by 1 point, outperforming the previous best system. Moreover, we show that our pre-training approach is particularly effective when training data is limited, improving the learning curve by a large amount.
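To make the SSPT objective concrete, here is a minimal Python sketch of how one such cloze-like training instance might be constructed: an answer span is blanked out of a query sentence with a special [BLANK] token, and the model must extract that span from a separately retrieved passage. The function name, data layout, and toy example below are illustrative assumptions, not the paper's actual pipeline (which mines query/passage pairs at corpus scale with a search index).

```python
# Minimal sketch of constructing one Span Selection Pre-Training (SSPT)
# instance. Helper name and field layout are illustrative assumptions.

BLANK = "[BLANK]"  # special token standing in for the masked answer span

def make_sspt_instance(query_sentence: str, answer_span: str, passage: str) -> dict:
    """Blank the answer span out of the query and locate it in the passage.

    The model must *select* the span from `passage` (extractive QA style)
    rather than recall it from its own parameters.
    """
    if answer_span not in query_sentence or answer_span not in passage:
        raise ValueError("answer span must occur in both query and passage")
    cloze_query = query_sentence.replace(answer_span, BLANK, 1)
    start = passage.index(answer_span)           # character offset of the span
    return {
        "question": cloze_query,                 # cloze-style query with [BLANK]
        "passage": passage,                      # retrieved evidence text
        "answer_start": start,
        "answer_end": start + len(answer_span),  # exclusive end offset
    }

# Toy usage: the span "1889" is recoverable from the passage, not memorized.
example = make_sspt_instance(
    "The Eiffel Tower was completed in 1889.",
    "1889",
    "Construction of the Eiffel Tower finished in 1889, in time for the World's Fair.",
)
print(example["question"])   # -> The Eiffel Tower was completed in [BLANK].
```

The key design point, as the abstract notes, is that the blanked span is always recoverable from the passage, so training rewards selecting evidence rather than memorizing facts.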
Pages: 2773-2782
Page count: 10
Related papers (50 in total)
  • [21] Improving Knowledge Tracing via Pre-training Question Embeddings
    Liu, Yunfei
    Yang, Yang
    Chen, Xianyu
    Shen, Jian
    Zhang, Haifeng
    Yu, Yong
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 1577 - 1583
  • [23] Cross-Modal self-supervised vision language pre-training with multiple objectives for medical visual question answering
    Liu, Gang
    He, Jinlong
    Li, Pengfei
    Zhao, Zixu
    Zhong, Shenjun
    JOURNAL OF BIOMEDICAL INFORMATICS, 2024, 160
  • [24] Pre-training Entity Relation Encoder with Intra-span and Inter-span Information
    Wang, Yijun
    Sun, Changzhi
    Wu, Yuanbin
    Yan, Junchi
    Gao, Peng
    Xie, Guotong
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 1692 - 1705
  • [25] POSPAN: Position-Constrained Span Masking for Language Model Pre-training
    Zhang, Zhenyu
    Shen, Lei
    Zhao, Yuming
    Chen, Meng
    He, Xiaodong
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 4420 - 4424
  • [26] Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
    Zhou, Wangchunshu
    Ge, Tao
    Xu, Canwen
    Xu, Ke
    Wei, Furu
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 571 - 582
  • [27] Span-Based Joint Entity and Relation Extraction with Transformer Pre-Training
    Eberts, Markus
    Ulges, Adrian
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2006 - 2013
  • [28] Subset selection for domain adaptive pre-training of language model
    Hwang, Junha
    Lee, Seungdong
    Kim, Haneul
    Jeong, Young-Seob
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [29] Strategy Selection in Question Answering
    Reder, L. M.
    COGNITIVE PSYCHOLOGY, 1987, 19 (01): 90 - 138
  • [30] CSS: Contrastive Span Selector for Multi-span Question Answering
    Zhao, Wen
    LECTURE NOTES IN ARTIFICIAL INTELLIGENCE (LNAI 14325), SPRINGER