50 records in total
- [32] Mitigating Privacy Seesaw in Large Language Models: Augmented Privacy Neuron Editing via Activation Patching. Findings of the Association for Computational Linguistics: ACL 2024, 2024: 5319-5332.
- [33] Towards Understanding and Mitigating Social Biases in Language Models. International Conference on Machine Learning, Vol. 139, 2021.
- [34] Potential use of large language models for mitigating students' problematic social media use: ChatGPT as an example. World Journal of Psychiatry, 2024, 14(3).
- [36] Knowledge Unlearning for Mitigating Privacy Risks in Language Models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Long Papers, Vol. 1, 2023: 14389-14408.
- [38] MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models. Computer Vision - ECCV 2024, Pt. LVI, 2025, 15114: 386-403.
- [40] Identifying Exaggerated Language. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 7024-7034.