Natural Questions: A Benchmark for Question Answering Research

Cited by: 0
Authors
Kwiatkowski T. [1]
Palomaki J. [1]
Redfield O. [1]
Collins M. [1, 2]
Parikh A. [1]
Alberti C. [1]
Epstein D. [1]
Polosukhin I. [1]
Devlin J. [1]
Lee K. [1]
Toutanova K. [1]
Jones L. [1]
Kelcey M. [1]
Chang M.-W. [1]
Dai A.M. [1]
Uszkoreit J. [1]
Le Q. [1]
Petrov S. [1]
Affiliations
[1] Google Research, United States
[2] Columbia University, United States
DOI
10.1162/tacl_a_00276
Abstract
We present the Natural Questions corpus, a question answering data set. Questions consist of real anonymized, aggregated queries issued to the Google search engine. An annotator is presented with a question along with a Wikipedia page from the top 5 search results, and annotates a long answer (typically a paragraph) and a short answer (one or more entities) if present on the page, or marks null if no long/short answer is present. The public release consists of 307,373 training examples with single annotations; 7,830 examples with 5-way annotations for development data; and a further 7,842 examples with 5-way annotations sequestered as test data. We present experiments validating the quality of the data. We also describe analysis of 25-way annotations on 302 examples, giving insights into human variability on the annotation task. We introduce robust metrics for the purposes of evaluating question answering systems; demonstrate high human upper bounds on these metrics; and establish baseline results using competitive methods drawn from related literature. © 2019 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.
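The annotation scheme described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names, field names, and the `(start, end)` span representation are hypothetical and are not the format of the released corpus. Only the split sizes and annotation counts come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical representation: (start, end) offsets into the Wikipedia page.
Span = Tuple[int, int]


@dataclass
class Annotation:
    """One annotator's judgment for a (question, Wikipedia page) pair."""
    long_answer: Optional[Span] = None  # None means no long answer on the page
    short_answers: List[Span] = field(default_factory=list)  # empty means no short answer

    def is_null(self) -> bool:
        # "Null" in the abstract's sense: neither a long nor a short answer.
        return self.long_answer is None and not self.short_answers


@dataclass
class Example:
    question: str
    annotations: List[Annotation]  # 1 per training example, 5 per dev/test example


# Public release sizes and annotations per example, as quoted in the abstract.
RELEASE_SIZES = {"train": 307_373, "dev": 7_830, "test": 7_842}
ANNOTATIONS_PER_EXAMPLE = {"train": 1, "dev": 5, "test": 5}


def total_annotations(split: str) -> int:
    """Number of individual annotations in a split (examples x annotators)."""
    return RELEASE_SIZES[split] * ANNOTATIONS_PER_EXAMPLE[split]
```

For instance, `Annotation(long_answer=(10, 42))` would stand for a page with a long answer but no short answer, while `Annotation()` stands for the null case.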
Pages: 453 - 466
Number of pages: 13
Related papers
  • [1] Natural Questions: A Benchmark for Question Answering Research
    Kwiatkowski, Tom
    Palomaki, Jennimaria
    Redfield, Olivia
    Collins, Michael
    Parikh, Ankur
    Alberti, Chris
    Epstein, Danielle
    Polosukhin, Illia
    Devlin, Jacob
    Lee, Kenton
    Toutanova, Kristina
    Jones, Llion
    Kelcey, Matthew
    Chang, Ming-Wei
    Dai, Andrew M.
    Uszkoreit, Jakob
    Le, Quoc
    Petrov, Slav
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2019, 7 : 453 - 466
  • [2] TempQuestions: A Benchmark for Temporal Question Answering
    Jia, Zhen
    Abujabal, Abdalghani
    Roy, Rishiraj Saha
    Stroetgen, Jannik
    Weikum, Gerhard
    COMPANION PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2018 (WWW 2018), 2018, : 1057 - 1062
  • [3] Routing Questions for Collaborative Answering in Community Question Answering
    Chang, Shuo
    Pal, Aditya
    2013 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2013, : 500 - 507
  • [4] Question and Answer Classification in Czech Question Answering Benchmark Dataset
    Kusnirakova, Dasa
    Medved, Marek
    Horak, Ales
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 701 - 706
  • [5] BEnQA: A Question Answering Benchmark for Bengali and English
    Shafayat, Sheikh
    Hasan, H. M. Quamran
    Mahim, Minhajur Rahman Chowdhury
    Putri, Rifki Afina
    Thorne, James
    Oh, Alice
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 1158 - 1177
  • [6] TimelineQA: A Benchmark for Question Answering over Timelines
    Tan, Wang-Chiew
    Dwivedi-Yu, Jane
    Li, Yuliang
    Mathias, Lambert
    Saeidi, Marzieh
    Yan, Jing Nathan
    Halevy, Alon Y.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 77 - 91
  • [7] KoBBQ: Korean Bias Benchmark for Question Answering
    Jin, Jiho
    Kim, Jiseon
    Lee, Nayeon
    Yoo, Haneul
    Oh, Alice
    Lee, Hwaran
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 507 - 524
  • [8] Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering
    Maryam, Hiba
    Fu, Ling
    Song, Jiajun
    Shafayet, Tajrian A. B. M.
    Luo, Qidi
    Bai, Xiang
    Liu, Yuliang
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT V, 2024, 14808 : 279 - 292
  • [9] Localized Questions in Medical Visual Question Answering
    Tascon-Morales, Sergio
    Marquez-Neila, Pablo
    Sznitman, Raphael
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT II, 2023, 14221 : 361 - 370