Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation

Cited by: 0
Authors
University of Calabria, Italy [1]
Institution
Source
arXiv | 1600
Keywords
DOI
None available
CLC number
Subject classification number
Abstract
Artificial intelligence
Related papers
50 in total
  • [31] A survey on multilingual large language models: corpora, alignment, and bias
    Xu, Yuemei
    Hu, Ling
    Zhao, Jiayi
    Qiu, Zihan
    Xu, Kexin
    Ye, Yuqi
    Gu, Hanwen
    FRONTIERS OF COMPUTER SCIENCE, 2025, 19 (11)
  • [32] ROBBIE: Robust Bias Evaluation of Large Generative Language Models
    Esiobu, David
    Tan, Xiaoqing
    Hosseini, Saghar
    Ung, Megan
    Zhang, Yuchen
    Fernandes, Jude
    Dwivedi-Yu, Jane
    Presani, Eleonora
    Williams, Adina
    Smith, Eric Michael
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 3764 - 3814
  • [33] Quantifying Bias in Agentic Large Language Models: A Benchmarking Approach
    Fernando, Riya
    Norton, Isabel
    Dogra, Pranay
    Sarnaik, Rohit
    Wazir, Hasan
    Ren, Zitang
    Gunda, Niveta Sree
    Mukhopadhyay, Anushka
    Lutz, Michael
    2024 5TH INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE, ICTC 2024, 2024, : 349 - 353
  • [34] Persistent Anti-Muslim Bias in Large Language Models
    Abid, Abubakar
    Farooqi, Maheen
    Zou, James
    AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2021, : 298 - 306
  • [35] Evaluating and Mitigating Gender Bias in Generative Large Language Models
    Zhou, H.
    Inkpen, D.
    Kantarci, B.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2024, 19 (06)
  • [36] LARGE LANGUAGE MODELS FOR RISK OF BIAS ASSESSMENT: A CASE STUDY
    Edwards, M.
    Bishop, E.
    Reddish, K.
    Carr, E.
    di Ruffano, L. Ferrante
    VALUE IN HEALTH, 2024, 27 (12)
  • [37] Likelihood Functions for Errors-in-variables Models: Bias-free Local Estimation with Minimum Variance
    Krajsek, Kai
    Heinemann, Christian
    Scharr, Hanno
    PROCEEDINGS OF THE 2014 9TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, THEORY AND APPLICATIONS (VISAPP 2014), VOL 3, 2014, : 270 - 279
  • [38] Adversarial Robustness for Large Language NER models using Disentanglement and Word Attributions
    Jin, Xiaomeng
    Vinzamuri, Bhanukiran
    Venkatapathy, Sriram
    Ji, Heng
    Natarajan, Pradeep
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 12437 - 12450
  • [39] Evaluating Nuanced Bias in Large Language Model Free Response Answers
    Healey, Jennifer
    Byrum, Laurie
    Akhtar, Md Nadeem
    Sinha, Moumita
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT II, NLDB 2024, 2024, 14763 : 378 - 391
  • [40] Implicit bias in large language models: Experimental proof and implications for education
    Warr, Melissa
    Oster, Nicole Jakubczyk
    Isaac, Roger
    JOURNAL OF RESEARCH ON TECHNOLOGY IN EDUCATION, 2024