Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models

Cited by: 218
Authors
Vaithilingam, Priyan [1]
Zhang, Tianyi [2]
Glassman, Elena L. [1]
Affiliations
[1] Harvard Univ, Cambridge, MA 02138 USA
[2] Purdue Univ, W Lafayette, IN 47907 USA
Keywords
large language model; GitHub Copilot
DOI
10.1145/3491101.3519665
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recent advances in Large Language Models (LLMs) have made automatic code generation possible for real-world programming tasks in general-purpose programming languages such as Python. However, there are few human studies on the usability of these tools and how they fit the programming workflow. In this work, we conducted a within-subjects user study with 24 participants to understand how programmers use and perceive Copilot, an LLM-based code generation tool. We found that, while Copilot did not necessarily improve the task completion time or success rate, most participants preferred to use Copilot in daily programming tasks, since Copilot often provided a useful starting point and saved the effort of searching online. However, participants did face difficulties in understanding, editing, and debugging code snippets generated by Copilot, which significantly hindered their task-solving effectiveness. Finally, we highlighted several promising directions for improving the design of Copilot based on our observations and participants' feedback.
Pages: 7
Related papers
50 in total
  • [21] Benchmarking Large Language Models for Automated Verilog RTL Code Generation
    Thakur, Shailja
    Ahmad, Baleegh
    Fan, Zhenxing
    Pearce, Hammond
    Tan, Benjamin
    Karri, Ramesh
    Dolan-Gavitt, Brendan
    Garg, Siddharth
    2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023
  • [22] On the Effectiveness of Large Language Models in Domain-Specific Code Generation
    Gu, Xiaodong
    Chen, Meng
    Lin, Yalan
    Hu, Yuhan
    Zhang, Hongyu
    Wan, Chengcheng
    Wei, Zhao
    Xu, Yong
    Wang, Juhong
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2025, 34 (03)
  • [23] Multi-stage guided code generation for Large Language Models
    Han, Yewei
    Lyu, Chen
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 139
  • [24] Evaluating Large Language Models for Automated CPT Code Prediction in Endovascular Neurosurgery
    Roy, Joanna M.
    Self, D. Mitchell
    Isch, Emily
    Musmar, Basel
    Lan, Matthews
    Keppetipola, Kavantissa
    Koduri, Sravanthi
    Pontarelli, Mary-Katharine
    Tjoumakaris, Stavropoula I.
    Gooch, M. Reid
    Rosenwasser, Robert H.
    Jabbour, Pascal M.
    JOURNAL OF MEDICAL SYSTEMS, 2025, 49 (01)
  • [25] Evaluating Large Language Models for G-Code Debugging, Manipulation, and Comprehension
    Jignasu, Anushrut
    Marshall, Kelly
    Ganapathysubramanian, Baskar
    Balu, Aditya
    Hegde, Chinmay
    Krishnamurthy, Adarsh
    2024 IEEE LLM AIDED DESIGN WORKSHOP, LAD 2024, 2024
  • [26] Evaluating application of large language models to biomedical patent claim generation
    Chen, Feng-Chi
    Pan, Chia-Lin
    AIPlux Development Team
    WORLD PATENT INFORMATION, 2025, 80
  • [27] Humans vs large language models: An assessment of evaluating online dermatological misinformation
    Fanous, A. H.
    Le, M.
    Rezaei, S.
    Xu, S.
    Ko, J.
    Lipoff, J.
    Daneshjou, R.
    JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2024, 144 (08) : S130 - S130
  • [28] Code-level quantum circuit generation based on large language models
    He, Zhimin
    Li, Guohong
    Situ, Haozhen
    Zhou, Yan
    Zheng, Shenggen
    Li, Lvzhou
    SCIENTIA SINICA-PHYSICA MECHANICA & ASTRONOMICA, 2025, 55 (04)
  • [29] FormalEval: A Method for Automatic Evaluation of Code Generation via Large Language Models
    Yang, Sichao
    Yang, Ye
    2024 INTERNATIONAL SYMPOSIUM OF ELECTRONICS DESIGN AUTOMATION, ISEDA 2024, 2024 : 660 - 665
  • [30] Automatic Generation of Programming Exercises and Code Explanations Using Large Language Models
    Sarsa, Sami
    Denny, Paul
    Hellas, Arto
    Leinonen, Juho
    PROCEEDINGS OF THE 2022 ACM CONFERENCE ON INTERNATIONAL COMPUTING EDUCATION RESEARCH, ICER 2022, VOL. 1, 2023 : 27 - 43