Cooperative Self-training of Machine Reading Comprehension

Cited: 0
Authors
Luo, Hongyin [1]
Li, Shang-Wen [2]
Gao, Mingye [3]
Yu, Seunghak [2]
Glass, James [1]
Affiliations
[1] MIT CSAIL, Cambridge, MA 02139 USA
[2] Amazon AI, Bellevue, WA USA
[3] MIT MTL, Cambridge, MA USA
Keywords: (not listed)
DOI: Not available
CLC Number: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Pretrained language models have significantly improved the performance of downstream language understanding tasks, including extractive question answering, by providing high-quality contextualized word embeddings. However, training question answering models still requires large amounts of annotated data for specific domains. In this work, we propose a cooperative self-training framework, RGX, for automatically generating more non-trivial question-answer pairs to improve model performance. RGX is built on a masked answer extraction task with an interactive learning environment comprising an answer entity Recognizer, a question Generator, and an answer eXtractor. Given a passage with a masked entity, the generator produces a question about the entity, and the extractor is trained to recover the masked entity from the generated question and the raw text. The framework enables training question generation and answering models on any text corpus without annotation. We further apply a self-training technique to improve the performance of both the question generation and answer extraction models. Experimental results show that RGX outperforms state-of-the-art (SOTA) pretrained language models and transfer learning approaches on standard question-answering benchmarks, and achieves new SOTA performance under the given model size and transfer learning settings.
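The loop sketched in the abstract (mask an entity, generate a question about it, train the extractor to recover the masked entity) can be illustrated with a minimal, self-contained Python sketch. The three components below are deliberately toy stand-ins, not the paper's learned models: a regex-based Recognizer, a cloze-template Generator, and a string-matching eXtractor. All function names and the simple self-agreement filter are illustrative assumptions; the actual framework trains neural models for each role.

import re

def recognize_entities(passage):
    # Toy Recognizer: treat capitalized (multi-word) spans as candidate
    # answer entities; the paper learns this component.
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", passage)

def generate_question(passage, entity):
    # Toy Generator: mask the entity and wrap the cloze in a question
    # template; the paper uses a trained seq2seq question generator.
    masked = passage.replace(entity, "[MASK]", 1)
    return "What fills the blank in: '%s'" % masked

def extract_answer(passage, question, candidates):
    # Toy eXtractor: pick the candidate that, substituted back into the
    # cloze, reconstructs the original passage; the paper instead uses an
    # extractive QA model that scores spans of the raw text.
    match = re.search(r"'(.*)'", question)
    cloze = match.group(1) if match else ""
    for candidate in candidates:
        if cloze.replace("[MASK]", candidate, 1) == passage:
            return candidate
    return None

def rgx_round(passage):
    # One cooperative round: for each recognized entity, generate a
    # question and keep the (question, answer) pair only if the extractor
    # recovers the masked entity -- a toy self-agreement filter standing
    # in for the paper's self-training selection.
    synthetic_qa = []
    entities = recognize_entities(passage)
    for entity in entities:
        question = generate_question(passage, entity)
        if extract_answer(passage, question, entities) == entity:
            synthetic_qa.append((question, entity))
    return synthetic_qa

if __name__ == "__main__":
    text = "Marie Curie discovered polonium while working in Paris."
    for question, answer in rgx_round(text):
        print("Q:", question)
        print("A:", answer)

In the full framework, the surviving synthetic pairs would be added to the training data and the generator and extractor retrained, iterating the cooperative loop described in the abstract.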
Pages: 244-257
Page count: 14
Related Papers
50 entries in total
  • [1] Machine Reading Comprehension Framework Based on Self-Training for Domain Adaptation
    Lee, Hyeon-Gu
    Jang, Youngjin
    Kim, Harksoo
    IEEE ACCESS, 2021, 9 : 21279 - 21285
  • [2] A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction
    Niu, Yilin
    Jiao, Fangkai
    Zhou, Mantong
    Yao, Ting
    Xu, Jingfang
    Huang, Minlie
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3916 - 3927
  • [3] Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training
    Ouyang, Jianquan
    Fu, Mengen
    MATHEMATICS, 2022, 10 (03)
  • [4] A Robust Adversarial Training Approach to Machine Reading Comprehension
    Liu, Kai
    Liu, Xin
    Yang, An
    Liu, Jing
    Su, Jinsong
    Li, Sujian
    She, Qiaoqiao
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8392 - 8400
  • [5] Adversarial Training for Machine Reading Comprehension with Virtual Embeddings
    Yang, Ziqing
    Cui, Yiming
    Si, Chenglei
    Che, Wanxiang
    Liu, Ting
    Wang, Shijin
    Hu, Guoping
    10TH CONFERENCE ON LEXICAL AND COMPUTATIONAL SEMANTICS (SEM 2021), 2021, : 308 - 313
  • [6] Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation
    Zhang, Yangsong
    Roy, Subhankar
    Lu, Hongtao
    Ricci, Elisa
    Lathuiliere, Stephane
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 5593 - 5602
  • [7] Hypothetical Training for Robust Machine Reading Comprehension of Tabular Context
    Li, Moxin
    Wang, Wenjie
    Feng, Fuli
    Zhang, Hanwang
    Wang, Qifan
    Chua, Tat-Seng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1220 - 1236
  • [8] Self-Training for Unsupervised Neural Machine Translation in Unbalanced Training Data Scenarios
    Sun, Haipeng
    Wang, Rui
    Chen, Kehai
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Tiejun
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3975 - 3981
  • [9] Self-Training Reinforced Adversarial Adaptation for Machine Fault Diagnosis
    Jiao, Jinyang
    Li, Hao
    Lin, Jing
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2023, 70 (11) : 11649 - 11658