Why Johnny Can't Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts

Cited by: 227
|
Authors
Zamfirescu-Pereira, J. D. [1]
Wong, Richmond [2 ]
Hartmann, Bjoern [1 ]
Yang, Qian [3 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
[3] Cornell Univ, Ithaca, NY USA
Keywords
language models; end-users; design tools;
DOI
10.1145/3544548.3581388
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Pre-trained large language models ("LLMs") like GPT-3 can engage in fluent, multi-turn instruction-taking out-of-the-box, making them attractive materials for designing natural language interactions. Using natural language to steer LLM outputs ("prompting") has emerged as an important design technique potentially accessible to non-AI-experts. Crafting effective prompts can be challenging, however, and prompt-based interactions are brittle. Here, we explore whether non-AI-experts can successfully engage in "end-user prompt engineering" using a design probe: a prototype LLM-based chatbot design tool supporting development and systematic evaluation of prompting strategies. Ultimately, our probe participants explored prompt designs opportunistically, not systematically, and struggled in ways echoing end-user programming systems and interactive machine learning systems. Expectations stemming from human-to-human instructional experiences, and a tendency to overgeneralize, were barriers to effective prompt design. These findings have implications for non-AI-expert-facing LLM-based tool design and for improving LLM-and-prompt literacy among programmers and the public, and present opportunities for further research.
Pages: 21