LLMs for science: Usage for code generation and data analysis

Cited by: 3

Authors:

Nejjar, Mohamed [1]

Zacharias, Luca [1]

Stiehle, Fabian [1]

Weber, Ingo [1,2]

Affiliations:

[1] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany

[2] Fraunhofer Gesell, Munich, Germany

Source:

JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS | 2025 / Vol. 37 / Issue 01

Keywords:

artificial intelligence; code generation; data analysis; GenAI4Science; large language models; LLMs4Science; research methods;

DOI:

10.1002/smr.2723

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Large language models (LLMs) have been touted to enable increased productivity in many areas of today's work life. Scientific research as an area of work is no exception: The potential of LLM-based tools to assist in the daily work of scientists has become a highly discussed topic across disciplines. However, we are only at the very onset of this subject of study. It is still unclear how the potential of LLMs will materialize in research practice. With this study, we provide initial empirical evidence on the use of LLMs in the research process. We have investigated a set of use cases for LLM-based tools in scientific research and conducted a first study to assess to what degree current tools are helpful. In this position paper, we report on use cases related to software engineering, specifically generating application code and developing scripts for data analytics and visualization. While we studied seemingly simple use cases, results across tools differ significantly. Our results highlight the promise of LLM-based tools in general, yet we also observe various issues, particularly regarding the integrity of the output these tools provide.


Pages: 7

