Prompt Optimization in Large Language Models

Cited by: 3
Authors
Sabbatella, Antonio [1 ]
Ponti, Andrea [2 ]
Giordani, Ilaria [3 ]
Candelieri, Antonio [2 ]
Archetti, Francesco [1 ]
Affiliations
[1] Univ Milano Bicocca, Dept Comp Sci Syst & Commun, I-20126 Milan, Italy
[2] Univ Milano Bicocca, Dept Econ Management & Stat, I-20126 Milan, Italy
[3] Oaks srl, I-20125 Milan, Italy
Keywords
Bayesian Optimization; prompt optimization; black-box Large Language Models
DOI
10.3390/math12060929
CLC Classification Number
O1 [Mathematics];
Subject Classification Code
0701 ; 070101 ;
Abstract
Prompt optimization is crucial for improving the performance of large language models on downstream tasks. In this paper, a prompt is a sequence of n-grams selected from a vocabulary, and the aim is to select the prompt that is optimal with respect to a given performance metric. Prompt optimization can be cast as a combinatorial optimization problem, in which the number of possible prompts (i.e., the size of the combinatorial search space) is the size of the vocabulary (i.e., the number of possible n-grams) raised to the power of the prompt length. Exhaustive search is impractical; thus, an efficient search strategy is needed. We propose a Bayesian Optimization method performed over a continuous relaxation of the combinatorial search space. Bayesian Optimization is the dominant approach in black-box optimization owing to its sample efficiency, modular structure, and versatility. We use BoTorch, a library for Bayesian Optimization research built on top of PyTorch. Specifically, we focus on Hard Prompt Tuning, which directly searches for an optimal prompt to be added to the text input without requiring internal access to the Large Language Model, using it as a black box (as with GPT-4, which is available only as a Model as a Service). Albeit preliminary and based on "vanilla" Bayesian Optimization algorithms, our experiments with RoBERTa as the large language model, on six benchmark datasets, show good performance compared against other state-of-the-art black-box prompt optimization methods and enable an analysis of the trade-off between the size of the search space, accuracy, and wall-clock time.
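To make the combinatorics concrete: with a vocabulary of roughly 50,000 tokens (about the size of RoBERTa's) and a 5-token prompt, the discrete search space already holds 50,000^5 ≈ 3 × 10^23 candidate prompts, which is why exhaustive search is ruled out. Below is a minimal sketch of the kind of loop the abstract describes, assuming BoTorch's standard GP surrogate and Expected Improvement acquisition; the toy embedding table `E`, its dimensions, and the `evaluate_prompt` scorer are illustrative placeholders, not the authors' actual setup.

```python
# A minimal sketch (not the authors' implementation): Bayesian Optimization
# with BoTorch over a continuous relaxation of the prompt space. Each of the
# L prompt slots is relaxed to a d-dimensional vector; continuous candidates
# are projected back to the nearest token embedding before each black-box
# evaluation. `E` (toy embedding table) and `evaluate_prompt` (placeholder
# scorer) stand in for the LLM's token embeddings and the downstream metric.
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

torch.manual_seed(0)
V, L, d = 100, 3, 8                        # toy vocabulary size, prompt length, embedding dim
E = torch.randn(V, d, dtype=torch.double)  # stand-in for the token embedding table

def project(x):
    """Snap a continuous candidate of shape (L*d,) to the nearest token id per slot."""
    return torch.cdist(x.view(L, d), E).argmin(dim=1)        # (L,) token ids

def evaluate_prompt(token_ids):
    """Placeholder black-box objective; replace with the downstream-task score."""
    return -((E[token_ids] - 0.5) ** 2).sum().view(1, 1)     # shape (1, 1)

bounds = torch.stack([torch.full((L * d,), -3.0),
                      torch.full((L * d,), 3.0)]).double()

# Initial design: a handful of random continuous points, scored after projection.
X = torch.rand(5, L * d, dtype=torch.double) * 6.0 - 3.0
Y = torch.cat([evaluate_prompt(project(x)) for x in X])

for _ in range(20):                        # black-box evaluation budget
    gp = SingleTaskGP(X, Y)                # GP surrogate over the relaxed space
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acq = ExpectedImprovement(gp, best_f=Y.max())
    cand, _ = optimize_acqf(acq, bounds=bounds, q=1,
                            num_restarts=5, raw_samples=32)
    X = torch.cat([X, cand])
    Y = torch.cat([Y, evaluate_prompt(project(cand.squeeze(0)))])

print("best prompt token ids:", project(X[Y.argmax()]).tolist())
```

Projecting each continuous candidate onto its nearest token embedding is one common way to map the relaxation back to a valid discrete prompt; the paper's actual projection, surrogate, and acquisition choices may differ.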
Pages: 14
Related Papers
50 records in total
  • [1] Robust Prompt Optimization for Large Language Models Against Distribution Shifts
    Li, Moxin
    Wang, Wenjie
    Feng, Fuli
    Cao, Yixin
    Zhang, Jizhi
    Chua, Tat-Seng
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 1539 - 1554
  • [2] WORDFLOW: Social Prompt Engineering for Large Language Models
    Wang, Zijie J.
    Chakravarthy, Aishwarya
    Munechika, David
    Chau, Duen Horng
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 3: SYSTEM DEMONSTRATIONS, 2024, : 42 - 50
  • [3] PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)
    Nazzal, Mahmoud
    Khalil, Issa
    Khreishah, Abdallah
    Phan, NhatHai
CCS 2024 - PROCEEDINGS OF THE 2024 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2024, : 2266 - 2279
  • [4] Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
    Cheng, Jiale
    Liu, Xiao
    Zheng, Kehan
    Ke, Pei
    Wang, Hongning
    Dong, Yuxiao
    Tang, Jie
    Huang, Minlie
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 3201 - 3219
  • [5] Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization
    Che, Tianshi
    Liu, Ji
    Zhou, Yang
    Ren, Jiaxiang
    Zhou, Jiwen
    Sheng, Victor S.
    Dai, Huaiyu
    Dou, Dejing
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 7871 - 7888
  • [6] To prompt or not to prompt: Navigating the use of Large Language Models for integrating and modeling heterogeneous data
    Remadi, Adel
    El Hage, Karim
    Hobeika, Yasmina
    Bugiotti, Francesca
    DATA & KNOWLEDGE ENGINEERING, 2024, 152
  • [7] PromptMaker: Prompt-based Prototyping with Large Language Models
    Jiang, Ellen
    Olson, Kristen
    Toh, Edwin
    Molina, Alejandra
    Donsbach, Aaron
    Terry, Michael
    Cai, Carrie J.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022,
  • [8] Balancing Privacy and Robustness in Prompt Learning for Large Language Models
    Shi, Chiyu
    Su, Junyu
    Chu, Chiawei
    Wang, Baoping
    Feng, Duanyang
    MATHEMATICS, 2024, 12 (21)
  • [9] Response Generated by Large Language Models Depends on the Structure of the Prompt
    Sarangi, Pradosh Kumar
    Mondal, Himel
INDIAN JOURNAL OF RADIOLOGY AND IMAGING, 2024, 34 (03): 574 - 575
  • [10] DPO: Discrete Prompt Optimization for Vision-Language Models
    Liang, Nanhao
    Liu, Yong
    IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 671 - 675