Does Noise Really Matter? Investigation into the Influence of Noisy Labels on BERT-Based Question Answering System

Cited by: 1
Authors
Alexandrov, Dmitriy [1]
Zakharova, Anastasiia [1]
Butakov, Nikolay [1]
Affiliations
[1] ITMO Univ, St Petersburg, Russia
Keywords
Question-answering system; BERT; noisy data; noise simulation
DOI
10.1142/S1793351X24410046
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent work with BERT-based models demonstrates their generalization ability and high performance on new-domain tasks. However, models of this kind require a large amount of data. Collecting such data is error-prone, so it is important to know how errors in the data affect model quality. In this work, we study how noisy data, that is, data containing different kinds of errors, affects the training of a BERT model for question answering over text (QAT). We consider three kinds of noise: random, structural, and irrelevant-question noise. We study the robustness of QAT models during training under different settings, datasets, and noise types, and discuss possible reasons for the observed behavior. We also propose a real-world domain dataset to test our findings in a real-world scenario. The experimental study showed that following the recommendations we developed improved performance by up to 3.6% in a real-world setting.
Pages: 77-96
Number of pages: 20