Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation


Institution:

University of Calabria, Italy

Source:

arXiv | 1600

DOI:

Not available

Abstract:

Artificial intelligence


Related papers

50 in total

[41] Using Large Language Models to Investigate and Categorize Bias in Clinical Documentation
Apakama, D.
Klang, E.
Richardson, L.
Nadkarni, G.
ANNALS OF EMERGENCY MEDICINE, 2024, 84 (04) : S96 - S97
[42] Likelihood-based Mitigation of Evaluation Bias in Large Language Models
Ohi, Masanari
Kaneko, Masahiro
Koike, Ryuto
Loem, Mengsay
Okazaki, Naoaki
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3237 - 3245
[43] Understanding the Effect of Model Compression on Social Bias in Large Language Models
Goncalves, Gustavo
Strubell, Emma
2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 2663 - 2675
[44] Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning
Rytting, Christopher Michael
Wingate, David
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[45] Predicting startup success using two bias-free machine learning: resolving data imbalance using generative adversarial networks
Park, Jungryeol
Choi, Saesol
Feng, Yituo
JOURNAL OF BIG DATA, 2024, 11 (01)
[46] Human bias in AI models? Anchoring effects and mitigation strategies in large language models
Nguyen, Jeremy K.
JOURNAL OF BEHAVIORAL AND EXPERIMENTAL FINANCE, 2024, 43
[47] Pilot study on large language models for risk-of-bias assessments in systematic reviews: A(I) new type of bias?
Barsby, Joseph
Hume, Samuel
Lemmey, Hamish A. L.
Cutteridge, Joseph
Lee, Regent
Bera, Katarzyna D.
BMJ EVIDENCE-BASED MEDICINE, 2024
[48] A bias-free least-squares parameter estimator for continuous-time state-space models
Garnier, H
Sibille, P
Bastogne, T
PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 1997, : 1860 - 1865
[49] Communicating the cultural other: trust and bias in generative AI and large language models
Jenks, Christopher J.
APPLIED LINGUISTICS REVIEW, 2025, 16 (02) : 787 - 795
[50] Bias Unveiled: Enhancing Fairness in German Word Embeddings with Large Language Models
Saeid, Yasser
Kopinski, Thomas
SPEECH AND COMPUTER, SPECOM 2024, PT II, 2025, 15300 : 308 - 325
