HOTPOTQA: A Dataset for Diverse, Explainable Multi-hop Question Answering

Cited by: 0
Authors
Yang, Zhilin [1]
Qi, Peng [2]
Zhang, Saizheng [3]
Bengio, Yoshua [3, 4]
Cohen, William W. [5]
Salakhutdinov, Ruslan [1]
Manning, Christopher D. [2]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Stanford Univ, Stanford, CA 94305 USA
[3] Univ Montreal, Mila, Montreal, PQ, Canada
[4] CIFAR, Toronto, ON, Canada
[5] Google AI, Mountain View, CA USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing question answering (QA) datasets fail to train QA systems to perform complex reasoning and provide explanations for answers. We introduce HOTPOTQA, a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide sentence-level supporting facts required for reasoning, allowing QA systems to reason with strong supervision and explain the predictions; (4) we offer a new type of factoid comparison questions to test QA systems' ability to extract relevant facts and perform necessary comparison. We show that HOTPOTQA is challenging for the latest QA systems, and the supporting facts enable models to improve performance and make explainable predictions.
Pages: 2369-2380
Page count: 12
Related papers
50 in total
  • [1] VIMQA: A Vietnamese Dataset for Advanced Reasoning and Explainable Multi-hop Question Answering
    Le, Nguyen-Khang
    Nguyen, Dieu-Hien
    Le, Tung
    Nguyen, Minh Le
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6521 - 6529
  • [2] Multi-hop Question Answering
    Mavi, Vaibhav
    Jangra, Anubhav
    Jatowt, Adam
    FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL, 2023, 17 (05): : 457 - 586
  • [3] Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning
    Li, Ruosen
    Du, Xinya
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 6779 - 6789
  • [4] Two-Phase Semantic Retrieval for Explainable Multi-Hop Question Answering
    Wang, Qin
    Feng, Jianzhou
    Xu, Ganlin
    Huang, Lei
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448 : 452 - 465
  • [5] HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data
    Chen, Wenhu
    Zha, Hanwen
    Chen, Zhiyu
    Xiong, Wenhan
    Wang, Hong
    Wang, William
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1026 - 1036
  • [6] Unsupervised Multi-hop Question Answering by Question Generation
    Pan, Liangming
    Chen, Wenhu
    Xiong, Wenhan
    Kan, Min-Yen
    Wang, William Yang
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5866 - 5880
  • [7] Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Science Question Answering
    Xu, Weiwen
    Zhang, Huihui
    Cai, Deng
    Lam, Wai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1044 - 1056
  • [8] Question Calibration and Multi-Hop Modeling for Temporal Question Answering
    Xue, Chao
    Liang, Di
    Wang, Pengfei
    Zhang, Jing
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19332 - 19340
  • [9] Ask to Understand: Question Generation for Multi-hop Question Answering
    Li, Jiawei
    Ren, Mucheng
    Gao, Yang
    Yang, Yizhe
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232 : 19 - 36
  • [10] Hierarchical Graph Network for Multi-hop Question Answering
    Fang, Yuwei
    Sun, Siqi
    Gan, Zhe
    Pillai, Rohit
    Wang, Shuohang
    Liu, Jingjing
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 8823 - 8838