Cooperative Self-training of Machine Reading Comprehension

Cited by: 0
Authors
Luo, Hongyin [1]
Li, Shang-Wen [2]
Gao, Mingye [3]
Yu, Seunghak [2]
Glass, James [1]
Affiliations
[1] MIT CSAIL, Cambridge, MA 02139 USA
[2] Amazon AI, Bellevue, WA USA
[3] MIT MTL, Cambridge, MA USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Pretrained language models have significantly improved the performance of downstream language understanding tasks, including extractive question answering, by providing high-quality contextualized word embeddings. However, training question answering models still requires large amounts of annotated data for specific domains. In this work, we propose a cooperative self-training framework, RGX, for automatically generating more non-trivial question-answer pairs to improve model performance. RGX is built upon a masked answer extraction task with an interactive learning environment containing an answer entity Recognizer, a question Generator, and an answer eXtractor. Given a passage with a masked entity, the generator generates a question around the entity, and the extractor is trained to extract the masked entity given the generated question and the raw text. The framework allows question generation and answering models to be trained on any text corpus without annotation. We further leverage a self-training technique to improve the performance of both the question generation and answer extraction models. Experimental results show that RGX outperforms state-of-the-art (SOTA) pretrained language models and transfer learning approaches on standard question-answering benchmarks, and yields new SOTA performance under the given model size and transfer learning settings.
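
The abstract describes RGX's cooperative loop at a high level; the sketch below makes the data flow concrete. It is a minimal illustration under stated assumptions: the [MASK] symbol, the function names, and the toy heuristics inside recognize_entities, generate_question, and extract_answer are hypothetical placeholders (in the paper each component is a trained neural model), and the self-consistency filter is one common self-training heuristic, not necessarily the authors' exact procedure.

# Minimal sketch of one RGX-style round, per the abstract: a Recognizer
# proposes candidate answer entities, a Generator asks a question about a
# masked entity, and an eXtractor is trained on the synthesized pairs.
# All components below are toy placeholders, not the authors' code.

from dataclasses import dataclass

MASK = "[MASK]"  # assumed mask symbol for the masked answer extraction task

@dataclass
class QAPair:
    passage: str
    question: str
    answer: str

def recognize_entities(passage: str) -> list[str]:
    # Toy Recognizer: treat capitalized words as candidate answer entities.
    return [w.strip(".,") for w in passage.split() if w[:1].isupper()]

def generate_question(masked_passage: str) -> str:
    # Toy Generator: template a question about the masked span.
    return f"What fills the blank in: '{masked_passage}'?"

def extract_answer(passage: str, question: str) -> str:
    # Toy eXtractor: return the first candidate span (stand-in for a QA model).
    candidates = recognize_entities(passage)
    return candidates[0] if candidates else ""

def rgx_round(passages: list[str]) -> list[QAPair]:
    """One cooperative round: synthesize QA pairs from unlabeled text and
    keep the self-consistent ones as training data for the extractor."""
    kept: list[QAPair] = []
    for passage in passages:
        for entity in recognize_entities(passage):
            masked = passage.replace(entity, MASK, 1)
            question = generate_question(masked)
            # Self-training filter (assumed): keep a pair only if the
            # extractor recovers the masked entity, so the synthetic data
            # stays answerable without human annotation.
            if extract_answer(passage, question) == entity:
                kept.append(QAPair(passage, question, entity))
    return kept

if __name__ == "__main__":
    demo = ["Turing proposed the imitation game in 1950."]
    for pair in rgx_round(demo):
        print(pair)

In the full framework, the pairs returned by such a round would be used to fine-tune both the generator and the extractor, and the loop would be repeated so the two models improve cooperatively.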
Pages: 244-257
Number of pages: 14