- [4] SafetyBench: Evaluating the Safety of Large Language Models. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Vol. 1: Long Papers, 2024: 15537-15553
- [7] Evaluating Large Language Models on Their Accuracy and Completeness. Retina-The Journal of Retinal and Vitreous Diseases, 2025, 45(01): 128-132
- [8] Evaluating Intelligence and Knowledge in Large Language Models. Topoi-An International Review of Philosophy, 2025, 44(01): 163-173
- [10] Augmenting Autotelic Agents with Large Language Models. Conference on Lifelong Learning Agents, Vol. 232, 2023, 232: 205-226