Natural Questions: A Benchmark for Question Answering Research

Cited by: 0
Authors
Kwiatkowski T. [1]
Palomaki J. [1]
Redfield O. [1]
Collins M. [1, 2]
Parikh A. [1]
Alberti C. [1]
Epstein D. [1]
Polosukhin I. [1]
Devlin J. [1]
Lee K. [1]
Toutanova K. [1]
Jones L. [1]
Kelcey M. [1]
Chang M.-W. [1]
Dai A.M. [1]
Uszkoreit J. [1]
Le Q. [1]
Petrov S. [1]
Affiliations
[1] Google Research, United States
[2] Columbia University, United States
DOI: 10.1162/tacl_a_00276
Abstract
We present the Natural Questions corpus, a question answering data set. Questions consist of real, anonymized, aggregated queries issued to the Google search engine. An annotator is presented with a question along with a Wikipedia page from the top 5 search results, and annotates a long answer (typically a paragraph) and a short answer (one or more entities) if present on the page, or marks null if no long/short answer is present. The public release consists of 307,373 training examples with single annotations; 7,830 examples with 5-way annotations for development data; and a further 7,842 examples with 5-way annotations sequestered as test data. We present experiments validating the quality of the data. We also describe an analysis of 25-way annotations on 302 examples, giving insights into human variability on the annotation task. We introduce robust metrics for the purposes of evaluating question answering systems; demonstrate high human upper bounds on these metrics; and establish baseline results using competitive methods drawn from related literature. © 2019 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.
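The 5-way annotation scheme described in the abstract can be sketched as follows. This is an illustrative mock-up only: the field names (`long_answer`, `short_answers`) and the 2-of-5 aggregation threshold are assumptions for illustration, not the official NQ release schema or evaluation rule.

```python
# Hypothetical sketch of a 5-way annotated dev/test example: each of the 5
# annotators marks a long answer (or None for null) and zero or more short
# answers. Field names and the threshold are illustrative assumptions.

def has_gold_long_answer(annotations, threshold=2):
    """Treat an example as having a long answer when at least `threshold`
    of the annotators marked a non-null long answer (assumed rule)."""
    non_null = [a for a in annotations if a["long_answer"] is not None]
    return len(non_null) >= threshold

example = {
    "question": "who founded google",
    "annotations": [
        {"long_answer": "paragraph_12", "short_answers": ["Larry Page", "Sergey Brin"]},
        {"long_answer": "paragraph_12", "short_answers": ["Larry Page", "Sergey Brin"]},
        {"long_answer": "paragraph_12", "short_answers": []},
        {"long_answer": None, "short_answers": []},  # one annotator marked null
        {"long_answer": "paragraph_12", "short_answers": ["Larry Page", "Sergey Brin"]},
    ],
}

print(has_gold_long_answer(example["annotations"]))  # True: 4 of 5 are non-null
```

Aggregating over multiple annotators in this way is what allows the human-variability analysis and the human upper bounds the abstract refers to.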
Pages: 453–466
Page count: 13
Related Papers (50 items total; items [31]–[40] shown)
  • [31] Wang, Dongsheng. Answering contextual questions based on ontologies and question templates. Frontiers of Computer Science in China, 2011, 5(4): 405–418.
  • [32] Abedissa, Tilahun; Libsie, Mulugeta. Amharic Question Answering for Biography, Definition, and Description Questions. Information and Communication Technology for Development for Africa (ICT4DA 2019), 2019, 1026: 301–310.
  • [33] Neuenschwander, D. E. Question #72. Answering kids' science questions. American Journal of Physics, 1998, 66(4): 275.
  • [34] Zhao, Minyi; Li, Bingjia; Wang, Jie; Li, Wanqing; Zhou, Wenjing; Zhang, Lan; Xuyang, Shijie; Yu, Zhihang; Yu, Xinkun; Li, Guangze; Dai, Aobotao; Zhou, Shuigeng. Towards Video Text Visual Question Answering: Benchmark and Baseline. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [35] Hong, Xiaojie; Song, Zixin; Li, Liangzhi; Wang, Xiaoli; Liu, Feiyan. BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering. Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, Pt IX, ECML PKDD 2024, 2024, 14949: 435–451.
  • [36] Brini, Wissal; Ellouze, Mariem; Mesfar, Slim; Belguith, Lamia Hadrich. An Arabic Question-Answering system for factoid questions. IEEE NLP-KE 2009: Proceedings of International Conference on Natural Language Processing and Knowledge Engineering, 2009: 417–+.
  • [37] Parrish, Alicia; Chen, Angelica; Nangia, Nikita; Padmakumar, Vishakh; Phang, Jason; Thompson, Jana; Phu Mon Htut; Bowman, Samuel R. BBQ: A Hand-Built Bias Benchmark for Question Answering. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 2086–2105.
  • [38] Jurczyk, Tomasz; Zhai, Michael; Choi, Jinho D. SelQA: A New Benchmark for Selection-based Question Answering. 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI 2016), 2016: 820–827.
  • [39] Sun, Yong; Song, Junfang; Song, Xiangyu; Hou, Jiazheng. Research on question retrieval method for community question answering. Multimedia Tools and Applications, 2023, 82(16): 24309–24325.
  • [40] Graesser, A. C. Psychological research on question answering and question asking. Discourse Processes, 1990, 13(3): 259–260.