DISFL-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering

Citations: 0
Authors
Gupta, Aditya [1]
Xu, Jiacheng [2,4]
Upadhyay, Shyam [1]
Yang, Diyi [3]
Faruqui, Manaal [1]
Affiliations
[1] Google Assistant, Mountain View, CA USA
[2] Univ Texas Austin, Austin, TX 78712 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
[4] Google, Mountain View, CA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Disfluency is an under-studied topic in NLP, even though it is ubiquitous in human conversation. This is largely due to the lack of datasets containing disfluencies. In this paper, we present a new challenge question answering dataset, DISFL-QA, a derivative of SQUAD, where humans introduce contextual disfluencies into previously fluent questions. DISFL-QA contains a variety of challenging disfluencies that require a more comprehensive understanding of the text than was necessary in prior datasets. Experiments show that the performance of existing state-of-the-art question answering models degrades significantly when tested on DISFL-QA in a zero-shot setting. We show that data augmentation methods partially recover the loss in performance and also demonstrate the efficacy of using gold data for fine-tuning. We argue that we need large-scale disfluency datasets in order for NLP models to be robust to disfluencies. The dataset is publicly available at: https://github.com/google-research-datasets/disfl-qa.
Pages: 3309-3319
Page count: 11