AttractionDetailsQA: An Attraction Details Focused on Chinese Question Answering Dataset

Cited by: 1
Authors
Huang, Weiming [1,2]
Xu, Shiting [3]
Wang Yuhan [4]
Jin Fan [1,2]
Chang, Qingling [1,2]
Affiliations
[1] Wuyi Univ, Fac Intelligent Mfg, Jiangmen 529000, Peoples R China
[2] China Germany Artificial Intelligence Inst Jiangm, Jiangmen 529000, Peoples R China
[3] Zhuhai 4DAGE Technol Co Ltd, Zhuhai 519000, Peoples R China
[4] Jiangsu Univ Sci & Technol, Sch Naval Architecture & Ocean Engn, Zhenjiang 212003, Jiangsu, Peoples R China
Source
IEEE ACCESS | 2022 / Volume 10
Keywords
Annotations; Data models; Question answering (information retrieval); Manuals; Layout; Benchmark testing; Tourism industry; Attraction detail dataset; question-answering pair generation
DOI
10.1109/ACCESS.2022.3181188
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
With the growing number of domestic tourists and the spread of digital upgrades at attractions, it is crucial to develop a question-answering (QA) system about the details of attractions. However, there is little work on attraction QA, and the main bottleneck is the lack of available datasets. While previous QA datasets, such as CNN/DAILYMAIL and NewsQA, usually focus on the news domain, we present the first large-scale dataset for QA over attraction details. To ensure that the data we collected are useful, we gather data only from public travel information websites. Unlike other QA datasets such as SQuAD, which are labeled manually, we built the dataset by combining manual annotation with annotation by a question-answer pair generation (QAG) model. The resulting dataset covers 2,808 attractions with a total of 18,245 QA pairs and includes seven types of attraction details: location, time, component, area, layout, rating, and character. The dataset is available at https://github.com/wyman130/AttractionDetailsQA. Because QAG has not been studied much for attraction details, we evaluated several QAG models on this dataset to obtain a benchmark. This provides a basis for subsequent improvements to the dataset and for research on QAG in attraction details.
Pages: 86215 - 86221
Number of pages: 7
Related papers
50 in total
  • [21] QED: A Framework and Dataset for Explanations in Question Answering
    Lamm, Matthew
    Palomaki, Jennimaria
    Alberti, Chris
    Andor, Daniel
    Choi, Eunsol
    Soares, Livio Baldini
    Collins, Michael
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2021, 9 : 790 - 806
  • [22] PerCQA: Persian Community Question Answering Dataset
    Jamali, Naghme
    Yaghoobzadeh, Yadollah
    Faili, Heshaam
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022 : 6083 - 6092
  • [23] PRAGMATICQA: A Dataset for Pragmatic Question Answering in Conversations
    Qi, Peng
    Du, Nina
    Manning, Christopher D.
    Huang, Jing
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023 : 6175 - 6191
  • [24] MemoriQA: A Question-Answering Lifelog Dataset
    Tran, Quang-Linh
    Nguyen, Binh
    Jones, Gareth J. F.
    Gurrin, Cathal
    PROCEEDINGS OF THE FIRST ACM WORKSHOP ON AI-POWERED QUESTION ANSWERING SYSTEMS FOR MULTIMEDIA, AIQAM 2024, 2024 : 7 - 12
  • [25] SYLLABUSQA: A Course Logistics Question Answering Dataset
    Fernandez, Nigel
    Scarlatos, Alexander
    Lan, Andrew
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024 : 10344 - 10369
  • [26] A Portuguese Dataset for Evaluation of Semantic Question Answering
    de Araujo, Denis Andrei
    Rigo, Sandro Jose
    Quaresma, Paulo
    Muniz, Joao Henrique
    COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2020, 2020, 12037 : 217 - 227
  • [27] FOCUSED SEARCH OF SEMANTIC CASES IN QUESTION ANSWERING
    SINGER, M
    PARBERY, G
    JAKOBSON, LS
    MEMORY & COGNITION, 1988, 16 (02) : 147 - 157
  • [28] FOCUSED SEARCH OF SEMANTIC CASES IN QUESTION ANSWERING
    SINGER, MR
    PARBERY, GE
    BULLETIN OF THE PSYCHONOMIC SOCIETY, 1985, 23 (04) : 294 - 294
  • [29] Single-dataset Experts for Multi-dataset Question Answering
    Friedman, Dan
    Dodge, Ben
    Chen, Danqi
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021 : 6128 - 6137
  • [30] EQUALS: A Real-world Dataset for Legal Question Answering via Reading Chinese Laws
    Chen, Andong
    Yao, Feng
    Zhao, Xinyan
    Zhang, Yating
    Sun, Changlong
    Liu, Yun
    Shen, Weixing
    PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND LAW, ICAIL 2023, 2023 : 71 - 80