Mitigating Exaggerated Safety in Large Language Models

Cited by: 0
Authors
Ray, Ruchira [1]
Bhalani, Ruchi [1]
Affiliation
[1] University of Texas at Austin, Department of Computer Science, United States
DOI: not available
Related papers
50 in total
  • [1] Locating and Mitigating Gender Bias in Large Language Models
    Cai, Yuchen
    Cao, Ding
    Guo, Rongxi
    Wen, Yaqin
    Liu, Guiquan
    Chen, Enhong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT IV, ICIC 2024, 2024, 14878 : 471 - 482
  • [2] Mitigating Factual Inconsistency and Hallucination in Large Language Models
    Muneeswaran, I
    Shankar, Advaith
    Varun, V.
    Gopalakrishnan, Saisubramaniam
    Vaddina, Vishal
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024: 1169 - 1170
  • [3] Evaluating and Mitigating Gender Bias in Generative Large Language Models
    Zhou, H.
    Inkpen, D.
    Kantarci, B.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2024, 19 (06)
  • [4] Respond in my Language: Mitigating Language Inconsistency in Response Generation based on Large Language Models
    Zhang, Liang
    Jin, Qin
    Huang, Haoyang
    Zhang, Dongdong
    Wei, Furu
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 4177 - 4192
  • [5] SafetyBench: Evaluating the Safety of Large Language Models
    Zhang, Zhexin
    Lei, Leqi
    Wu, Lindong
    Sun, Rui
    Huang, Yongkang
    Long, Chong
    Liu, Xiao
    Lei, Xuanyu
    Tang, Jie
    Huang, Minlie
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 15537 - 15553
  • [6] Safety of Large Language Models in Addressing Depression
    Heston, Thomas F.
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (12)
  • [7] On large language models safety, security, and privacy: A survey
    Zhang, Ran
    Li, Hong-Wei
    Qian, Xin-Yuan
    Jiang, Wen-Bo
    Chen, Han-Xiao
    Journal of Electronic Science and Technology, 2025, 23 (01) : 3 - 23
  • [8] Leverage Large Language Models For Enhanced Aviation Safety
    Fox, Kevin L.
    Niewoehner, Kevin R.
    Rahmes, Mark
    Wong, Josiah
    Razdan, Rahul
    2024 INTEGRATED COMMUNICATIONS, NAVIGATION AND SURVEILLANCE CONFERENCE, ICNS, 2024
  • [9] Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
    Li, Mingda
    Li, Xinyu
    Chen, Yifan
    Xuan, Wenfeng
    Zhang, Weinan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 4833 - 4850
  • [10] Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal
    Huang, Jianheng
    Cui, Leyang
    Wang, Ante
    Yang, Chengyi
    Liao, Xinting
    Song, Linfeng
    Yao, Junfeng
    Su, Jinsong
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 1416 - 1428