50 results in total
- [22] Benchmarking Large Language Models for Automated Verilog RTL Code Generation 2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023
- [23] Benchmarking Large Language Models on Controllable Generation under Diversified Instructions THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17808 - 17816
- [24] Benchmarking Causal Study to Interpret Large Language Models for Source Code 2023 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION, ICSME, 2023, : 329 - 334
- [25] StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 11143 - 11156
- [26] Benchmarking Large Language Models on Communicative Medical Coaching: A Dataset and a Novel System FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 1624 - 1637
- [27] EchoSwift: An Inference Benchmarking and Configuration Discovery Tool for Large Language Models (LLMs) COMPANION OF THE 15TH ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE COMPANION 2024, 2024, : 158 - 162
- [28] Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study IEEE ACCESS, 2025, 13 : 29698 - 29717
- [29] UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 5266 - 5293