Efficient Detection of Toxic Prompts in Large Language Models

Cited by: 0
Authors
Liu, Yi [1]
Yu, Junzhe [2]
Sun, Huijia [2]
Shi, Ling [1]
Deng, Gelei [1]
Chen, Yuqi [2]
Liu, Yang [1]
Affiliations
[1] Nanyang Technological University, Singapore, Singapore
[2] ShanghaiTech University, Shanghai, China
Source
arXiv | 1600
DOI
Not available
Related papers
50 in total
  • [21] Finetuning Large Language Models for Vulnerability Detection
    Shestov, Aleksei
    Levichev, Rodion
    Mussabayev, Ravil
    Maslov, Evgeny
    Zadorozhny, Pavel
    Cheshkov, Anton
    Mussabayev, Rustam
    Toleu, Alymzhan
    Tolegen, Gulmira
    Krassovitskiy, Alexander
    IEEE ACCESS, 2025, 13 : 38889 - 38900
  • [22] Detection avoidance techniques for large language models
    Schneider, Sinclair
    Steuber, Florian
    Schneider, Joao A. G.
    Rodosek, Gabi Dreo
    DATA & POLICY, 2025, 7
  • [23] Dynamic Voting for Efficient Reasoning in Large Language Models
    Xue, Mingfeng
    Liu, Dayiheng
    Lei, Wenqiang
    Ren, Xingzhang
    Yang, Baosong
    Xie, Jun
    Zhang, Yidan
    Peng, Dezhong
    Lv, Jiancheng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023 : 3085 - 3104
  • [24] Probing Toxic Content in Large Pre-Trained Language Models
    Ousidhoum, Nedjma
    Zhao, Xinran
    Fang, Tianqing
    Song, Yangqiu
    Yeung, Dit-Yan
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021 : 4262 - 4274
  • [25] Demystifying Prompts in Language Models via Perplexity Estimation
    Gonen, Hila
    Iyer, Srini
    Blevins, Terra
    Smith, Noah A.
    Zettlemoyer, Luke
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023 : 10136 - 10148
  • [26] Verbal lie detection using Large Language Models
    Loconte, Riccardo
    Russo, Roberto
    Capuozzo, Pasquale
    Pietrini, Pietro
    Sartori, Giuseppe
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [27] Chain of Stance: Stance Detection with Large Language Models
    Ma, Junxia
    Wang, Changjiang
    Xing, Hanwen
    Zhao, Dongming
    Zhang, Yazhou
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT V, NLPCC 2024, 2025, 15363 : 82 - 94
  • [28] Contextual Object Detection with Multimodal Large Language Models
    Zang, Yuhang
    Li, Wei
    Han, Jun
    Zhou, Kaiyang
    Loy, Chen Change
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, 133 (02) : 825 - 843
  • [29] Limitations of Large Language Models in Propaganda Detection Task
    Szwoch, Joanna
    Staszkow, Mateusz
    Rzepka, Rafal
    Araki, Kenji
    APPLIED SCIENCES-BASEL, 2024, 14 (10)
  • [30] Explaining Misinformation Detection Using Large Language Models
    Pendyala, Vishnu S.
    Hall, Christopher E.
    ELECTRONICS, 2024, 13 (09)