MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization

Authors
Yang, Zhiyu [4]
Zhou, Zihan [5]
Wang, Shuo [1]
Cong, Xin [1, 2, 3]
Han, Xu [1, 2, 3]
Yan, Yukun [1]
Liu, Zhenghao [6]
Tan, Zhixing [7]
Liu, Pengyuan [4]
Yu, Dong [4]
Liu, Zhiyuan [1, 2, 3]
Shi, Xiaodong [5]
Sun, Maosong [1, 2, 3]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Tech, Beijing, Peoples R China
[2] Tsinghua Univ, Inst AI, Beijing, Peoples R China
[3] Beijing Natl Res Ctr Informat Sci & Technol, Beijing, Peoples R China
[4] Beijing Language & Culture Univ, Beijing, Peoples R China
[5] Xiamen Univ, Xiamen, Peoples R China
[6] Northeastern Univ, Shenyang, Peoples R China
[7] Zhongguancun Lab, Beijing, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Abstract
Scientific data visualization plays a crucial role in research by enabling the direct display of complex information and assisting researchers in identifying implicit patterns. Despite its importance, the use of Large Language Models (LLMs) for scientific data visualization remains largely unexplored. In this study, we introduce MatPlotAgent, an efficient, model-agnostic LLM agent framework designed to automate scientific data visualization tasks. Leveraging the capabilities of both code LLMs and multi-modal LLMs, MatPlotAgent consists of three core modules: query understanding, code generation with iterative debugging, and a visual feedback mechanism for error correction. To address the lack of benchmarks in this field, we present MatPlotBench, a high-quality benchmark consisting of 100 human-verified test cases. Additionally, we introduce a scoring approach that utilizes GPT-4V for automatic evaluation. Experimental results demonstrate that MatPlotAgent can improve the performance of various LLMs, including both commercial and open-source models. Furthermore, the proposed evaluation method shows a strong correlation with human-annotated scores.
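
The three modules named in the abstract can be pictured as a simple control loop. The following is a minimal sketch only, not the authors' implementation: the function `plot_agent`, its parameters, the retry limits, and the four callables are all hypothetical names introduced for illustration, since the abstract does not specify any API. The callables stand in for the code LLM, the code runner, and the multi-modal LLM.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
ExpandQuery = Callable[[str], str]                  # query understanding
GenerateCode = Callable[[str, Optional[str]], str]  # (instructions, feedback) -> code
RunCode = Callable[[str], Tuple[bool, str]]         # code -> (ok, error text or figure path)
ReviewFigure = Callable[[str, str], Optional[str]]  # (query, figure) -> feedback, or None if fine


def plot_agent(
    query: str,
    expand: ExpandQuery,
    generate: GenerateCode,
    run: RunCode,
    review: ReviewFigure,
    max_debug: int = 3,   # assumed cap on debugging retries
    max_visual: int = 2,  # assumed cap on visual-feedback rounds
) -> str:
    """Return the last plotting code produced by the loop."""
    # Module 1: turn the user query into detailed plotting instructions.
    instructions = expand(query)
    feedback: Optional[str] = None
    code = ""
    for _ in range(max_visual + 1):
        # Module 2: generate code, then retry on execution errors.
        code = generate(instructions, feedback)
        ok, result = run(code)
        attempts = 0
        while not ok and attempts < max_debug:
            code = generate(instructions, f"Execution error: {result}")
            ok, result = run(code)
            attempts += 1
        if not ok:
            return code  # debugging budget exhausted
        # Module 3: a multi-modal model inspects the rendered figure.
        feedback = review(query, result)
        if feedback is None:
            return code  # figure accepted
    return code
```

In this sketch the execution-error loop sits inside the visual-feedback loop, so every figure shown to the reviewer comes from code that ran successfully; that nesting is a design assumption rather than something the abstract states.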
Pages: 11789-11804
Page count: 16