Does Noise Really Matter? Investigation into the Influence of Noisy Labels on BERT-Based Question Answering System

Cited by: 1

Authors:

Alexandrov, Dmitriy ^{[1]}

Zakharova, Anastasiia ^{[1]}

Butakov, Nikolay ^{[1]}

Affiliations:

[1] ITMO Univ, St Petersburg, Russia

Source:

INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING | 2024 / Vol. 18 / Issue 01

Keywords:

Question-answering system; BERT; noisy data; noise simulation;

DOI:

10.1142/S1793351X24410046

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent work with BERT-based models demonstrates their generalization ability and high performance on new-domain tasks. However, this kind of model requires a large amount of data. Collecting this data can be error-prone, so it is important to know how errors in the data affect the quality of the model. In this work, we study the impact of noisy data, that is, data containing different kinds of errors, on the training of a BERT model for question answering over text (QAT). We use the concepts of random, structural and irrelevant question noise. We study the robustness of QAT models during training under different settings, datasets and noise types, and discuss possible reasons for the observed behavior. We also propose a real-world domain dataset to probe our findings in a real-world scenario. The results of the experimental study show that following the developed recommendations improved performance by up to 3.6% in a real-world setting.

Pages: 77-96

Page count: 20

16 items in total

[1] Does Noise Really Matter? Investigation into the Influence of Noisy Labels on Bert-Based Question Answering System
Alexandrov, Dmitriy
Zakharova, Anastasiia
Butakov, Nikolay
2023 IEEE 17TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC, 2023, : 33 - 40
[2] A BERT-Based Model for Question Answering on Construction Incident Reports
Hassan, Hebatallah A. Mohamed
Marengo, Elisa
Nutt, Werner
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2022), 2022, 13286 : 215 - 223
[3] Quantifying confidence shifts in a BERT-based question answering system evaluated on perturbed instances
Shen, Ke
Kejriwal, Mayank
PLOS ONE, 2023, 18 (12):
[4] BB-KBQA: BERT-Based Knowledge Base Question Answering
Liu, Aiting
Huang, Ziqi
Lu, Hengtong
Wang, Xiaojie
Yuan, Caixia
CHINESE COMPUTATIONAL LINGUISTICS, CCL 2019, 2019, 11856 : 81 - 92
[5] A BERT-Based Semantic Matching Ranker for Open-Domain Question Answering
Xu, Shiyi
Liu, Feng
Huang, Zhen
Peng, Yuxing
Li, Dongsheng
2020 4TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2020, 2020, : 31 - 36
[6] A BERT-based Approach with Relation-aware Attention for Knowledge Base Question Answering
Luo, Da
Su, Jindian
Yu, Shanshan
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
[7] An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering
Kuo, Chia-Chih
Luo, Shang-Bao
Chen, Kuan-Yu
INTERSPEECH 2020, 2020, : 4173 - 4177
[8] BIRD-QA: A BERT-based Information Retrieval Approach to Domain Specific Question Answering
Chen, Yuhao
Zulkernine, Farhana
2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 3503 - 3510
[9] Improving BERT-based FAQ Retrieval System using Query, Question and Answer Simultaneously
Cho, Hyunsoo
Choi, Jiwon
Noh, Changju
Park, Kwang-Hyun
38TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN 2024, 2024, : 730 - 734
[10] Marie and BERT-A Knowledge Graph Embedding Based Question Answering System for Chemistry
Zhou, Xiaochi
Zhang, Shaocong
Agarwal, Mehal
Akroyd, Jethro
Mosbach, Sebastian
Kraft, Markus
ACS OMEGA, 2023, 8 (36): : 33039 - 33057