50 entries in total
- [1] Towards a benchmark dataset for large language models in the context of process automation. Digital Chemical Engineering, 2024, 13
- [4] CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38 No. 17, 2024: 18952-18960
- [6] Causal Dataset Discovery with Large Language Models. Workshop on Human-in-the-Loop Data Analytics, HILDA 2024, 2024
- [7] Construction of a Japanese Financial Benchmark for Large Language Models. Jt. Workshop Financ. Technol. Nat. Lang. Process., Knowl. Discov. from Unstructured Data Financ. Serv. Econ. Nat. Lang. Process., FinNLP-KDF-ECONLP LREC-COLING - Workshop Proc.: 1-9
- [8] HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models. 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, 2023: 6449-6464
- [9] Understanding the Dataset Practitioners Behind Large Language Models. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, CHI 2024, 2024
- [10] A Chinese Dataset for Evaluating the Safeguards in Large Language Models. Findings of the Association for Computational Linguistics: ACL 2024, 2024: 3106-3119