A Serbian Question Answering Dataset Created by Using the Web Scraping Technique

Cited by: 0
Authors
Cenic, Aleksandar B. [1]
Stojkovic, Suzana [1]
Affiliation
[1] Univ Nis, Fac Elect Engn, Aleksandra Medvedeva 14, Nish 18000, Serbia
Keywords
Question answering system; Web scraping; Question answering dataset; EXTRACTION;
DOI
10.1109/ICEST58410.2023.10187370
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Every artificial intelligence task requires a particular dataset to train the model and test it. As the expansion of the field of AI accelerates, data is becoming a critical resource. Natural language processing is a specific field in artificial intelligence that requires separate datasets for each task and each processed language. This paper describes the process of collecting a dataset for a question answering system in the Serbian language. Data collection was achieved using the Web scraping method. The Web scraper was implemented in the Python programming language. The resulting dataset contains 16374 questions and answers in 6 different fields: history, biology, geography, physics, chemistry, and mathematics.
Pages: 147-150
Number of pages: 4
Related papers
50 in total
  • [11] LLQA - Lifelog Question Answering Dataset
    Tran, Ly-Duyen
    Thanh Cong Ho
    Lan Anh Pham
    Binh Nguyen
    Gurrin, Cathal
    Zhou, Liting
    MULTIMEDIA MODELING (MMM 2022), PT I, 2022, 13141 : 217 - 228
  • [12] Automatic question answering using the web: Beyond the Factoid
    Radu Soricut
    Eric Brill
    Information Retrieval, 2006, 9 : 191 - 206
  • [13] Automatic Question Answering using the Web: Beyond the factoid
    Soricut, R
    Brill, E
    INFORMATION RETRIEVAL, 2006, 9 (02): : 191 - 206
  • [14] Automated question answering using semantic web services
    Jang, Minsu
    Sohn, Joo-Chan
    Cho, Hyun Kyu
    2ND IEEE ASIA-PACIFIC SERVICES COMPUTING CONFERENCE, PROCEEDINGS, 2007, : 344 - 348
  • [15] Scaling question answering to the Web
    Kwok, C
    Etzioni, O
    Weld, D
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2001, 19 (03) : 242 - 262
  • [16] Web Logs and Question Answering
    Sutcliffe, Richard F. E.
    Kruschwitz, Udo
    Mandl, Thomas
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : D1 - D7
  • [17] Probabilistic question answering on the Web
    Radev, D
    Fan, WG
    Qi, H
    Wu, H
    Grewal, A
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2005, 56 (06): : 571 - 583
  • [18] Question answering on the semantic web
    McGuinness, DL
    IEEE INTELLIGENT SYSTEMS, 2004, 19 (01): : 82 - 85
  • [19] ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
    Yu, Zhou
    Xu, Dejing
    Yu, Jun
    Yu, Ting
    Zhao, Zhou
    Zhuang, Yueting
    Tao, Dacheng
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9127 - 9134
  • [20] LyricScraper: A Dataset of Spanish Song Lyrics Created via Web Scraping and Dual-labeling for LLM Classification
    Alcantara, Tania
    Garcia-Vazquez, Omar
    Hernandez, Mayte
    Calvo, Hiram
    Desiderio, Alan
    COMPUTACION Y SISTEMAS, 2024, 28 (04): : 2251 - 2260