SleepQA: A Health Coaching Dataset on Sleep for Extractive Question Answering

Cited by: 0
Authors
Bojic, Iva [1]
Ong, Qi Chwen [1]
Thakkar, Megh [1]
Kamran, Esha [2]
Le Shua, Irving Yu [1]
Pang, Jaime Rei Ern [1]
Chen, Jessica [2]
Nayak, Vaaruni [2]
Joty, Shafiq [1, 3]
Car, Josip [1, 2]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Imperial Coll London, London, England
[3] Salesforce Res, Washington, DC USA
Source
Keywords
Factual Question Answering; Dense Passage Retrieval; Evidence-based Knowledge; Domain-specific Natural Language Processing;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Question Answering (QA) systems can support health coaches in facilitating clients' lifestyle behavior changes (e.g., in adopting healthy sleep habits). In this paper, we design a domain-specific QA pipeline for sleep coaching. To this end, we release SleepQA, a dataset created from 7,005 passages comprising 4,250 training examples with single annotations and 750 examples with 5-way annotations. We fine-tune different domain-specific BERT models on our dataset and perform extensive automatic and human evaluation of the resulting end-to-end QA pipeline. Comparisons of our pipeline with a baseline show improvements in domain-specific natural language processing on real-world questions. We hope that this dataset will lead to wider research interest in this important health domain.
Pages: 199-217
Page count: 19
Related papers
50 in total
  • [1] Enhancing yes/no question answering with weak supervision via extractive question answering
    Dimitris Dimitriadis
    Grigorios Tsoumakas
    Applied Intelligence, 2023, 53 : 27560 - 27570
  • [2] Enhancing yes/no question answering with weak supervision via extractive question answering
    Dimitriadis, Dimitris
    Tsoumakas, Grigorios
    APPLIED INTELLIGENCE, 2023, 53 (22) : 27560 - 27570
  • [3] Automatic question answering for multiple stakeholders, the epidemic question answering dataset
    Travis R. Goodwin
    Dina Demner-Fushman
    Kyle Lo
    Lucy Lu Wang
    Hoa T. Dang
    Ian M. Soboroff
    Scientific Data, 9
  • [4] Automatic question answering for multiple stakeholders, the epidemic question answering dataset
    Goodwin, Travis R.
    Demner-Fushman, Dina
    Lo, Kyle
    Wang, Lucy Lu
    Dang, Hoa T.
    Soboroff, Ian M.
    SCIENTIFIC DATA, 2022, 9 (01)
  • [5] Sequence tagging for biomedical extractive question answering
    Yoon, Wonjin
    Jackson, Richard
    Lagerberg, Aron
    Kang, Jaewoo
    BIOINFORMATICS, 2022, 38 (15) : 3794 - 3801
  • [6] QUASER: Question Answering with Scalable Extractive Rationalization
    Ghoshal, Asish
    Iyer, Srinivasan
    Paranjape, Bhargavi
    Lakhotia, Kushal
    Yih, Scott Wen-tau
    Mehdad, Yashar
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1208 - 1218
  • [7] QookA: A Cooking Question Answering Dataset
    Frummet, Alexander
    Elsweiler, David
    PROCEEDINGS OF THE 2024 CONFERENCE ON HUMAN INFORMATION INTERACTION AND RETRIEVAL, CHIIR 2024, 2024, : 406 - 410
  • [8] PQuAD: A Persian question answering dataset
    Darvishi, Kasra
    Shahbodaghkhan, Newsha
    Abbasiantaeb, Zahra
    Momtazi, Saeedeh
    COMPUTER SPEECH AND LANGUAGE, 2023, 80
  • [9] FQuAD: French Question Answering Dataset
    d'Hoffschmidt, Martin
    Belblidia, Wacim
    Heinrich, Quentin
    Brendle, Tom
    Vidal, Maxime
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1193 - 1208
  • [10] Slovak Dataset for Multilingual Question Answering
    Hladek, Daniel
    Stas, Jan
    Juhar, Jozef
    Koctur, Tomas
    IEEE ACCESS, 2023, 11 : 32869 - 32881