AlphaIntellect at SemEval-2024 Task 6: Detection of Hallucinations in Generated Text

Cited by: 0

Authors:

Choudhury, Sohan [1]

Saha, Priyam [2]

Ray, Subharthi [2]

Das, Shankha Shubhra [2]

Das, Dipankar [2]

Affiliations:

[1] KIIT, Bhubaneswar, India

[2] Jadavpur Univ, Kolkata, India

Source:

PROCEEDINGS OF THE 18TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2024 | 2024

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

One major issue in natural language generation (NLG) models is detecting hallucinations (semantically inaccurate outputs). This study investigates a hallucination detection system designed for three distinct NLG tasks: definition modeling, paraphrase generation, and machine translation. The system uses SentenceTransformer models to produce sentence embeddings and similarity scores, and feedforward neural networks for classification. Although the system shows good results on the SemEval-2024 benchmark, there is still room for improvement. Promising paths toward improving performance include multi-task learning methods, strategies for handling out-of-domain data and minimizing bias, and more sophisticated architectures.

Pages: 952-958

Page count: 7
