Optimizing Large Language Models: A Deep Dive into Effective Prompt Engineering Techniques

Times Cited: 0
|
Authors
Son, Minjun [1 ]
Won, Yun-Jae [2 ]
Lee, Sungjin [3 ]
Affiliations
[1] Sungkyunkwan Univ, Dept MetabioHlth, Suwon 16419, South Korea
[2] Korea Elect Technol Inst, Seongnam 13488, South Korea
[3] Soonchunhyang Univ, Dept Smart Automot, Asan 31538, South Korea
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, Issue 3
Keywords
large language model; prompt engineering; in-context learning; chain of thought; retrieval-augmented generation; step-by-step reasoning; tree of thought;
DOI
10.3390/app15031430
CLC Number
O6 [Chemistry]
Discipline Code
0703
Abstract
Recent advancements in Natural Language Processing (NLP) technologies have been driven at an unprecedented pace by the development of Large Language Models (LLMs). However, challenges remain, such as generating responses that are misaligned with the intent of the question or producing incorrect answers. This paper analyzes various Prompt Engineering techniques for large-scale language models and identifies methods that can optimize response performance across different datasets without the need for extensive retraining or fine-tuning. In particular, we examine prominent Prompt Engineering techniques including In-Context Learning (ICL), Chain of Thought (CoT), Retrieval-Augmented Generation (RAG), Step-by-Step Reasoning (SSR), and Tree of Thought (ToT), and we apply these techniques to leading LLMs such as Gemma2, Llama3, and Mistral. The performance of these models was evaluated using the AI2 Reasoning Challenge (ARC), HellaSwag, Massive Multitask Language Understanding (MMLU), TruthfulQA, Winogrande, and Grade School Math (GSM8k) datasets across metrics such as BLEU, ROUGE, METEOR, BLEURT, and BERTScore. The experimental results indicate that the most suitable Prompt Engineering technique can vary depending on the characteristics of each dataset. Specifically, for datasets emphasizing mathematical and logical reasoning, Prompt Engineering strategies centered around CoT, SSR, and ToT were found to be advantageous. For datasets focusing on natural language understanding, ICL-centric strategies were more effective, while RAG-based strategies were beneficial for datasets where factual accuracy is crucial. However, it was also observed that the optimal combination of Prompt Engineering techniques could differ depending on the specific LLM, indicating that fine-tuning the Prompt Engineering approach to the model and dataset is essential for achieving the best performance.
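The prompt-shaping techniques compared in the abstract can be sketched as minimal prompt builders. The function names, wording, and the sample question below are illustrative placeholders, not the paper's actual prompts or evaluation setup:

```python
# Minimal sketches of the Prompt Engineering techniques named in the abstract.
# All wording and examples are illustrative assumptions, not the paper's prompts.

def icl_prompt(question: str, examples: list[tuple[str, str]]) -> str:
    """In-Context Learning (ICL): prepend a few solved examples (few-shot)."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"

def cot_prompt(question: str) -> str:
    """Chain of Thought (CoT): elicit reasoning before the answer."""
    return f"Q: {question}\nA: Let's think step by step."

def ssr_prompt(question: str) -> str:
    """Step-by-Step Reasoning (SSR): request explicit numbered steps."""
    return ("Solve the problem below. Show each step on its own numbered "
            f"line, then state the final answer.\n{question}")

def tot_prompt(question: str, branches: int = 3) -> str:
    """Tree of Thought (ToT): explore and compare several reasoning paths."""
    return (f"Propose {branches} different approaches to the problem, "
            f"evaluate each briefly, and answer with the best one.\n{question}")

q = "If 3 pencils cost 45 cents, how much do 7 pencils cost?"
print(cot_prompt(q))
```

In an evaluation like the one described, each builder would wrap the same benchmark question, and the model's completions would then be scored with the listed metrics.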
The findings indicate that as LLMs become more advanced, their reliance on Prompt Engineering (PE) techniques diminishes, yet the magnitude of their performance improvement when PE strategies are applied increases. Furthermore, these advanced models tend to depend less on ICL techniques while exhibiting a greater reliance on RAG strategies. It is also evident that implementing RAG with PE-based preprocessing yields superior performance enhancements compared to the mere application of RAG on raw data.
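The abstract's final point, that RAG benefits from PE-based preprocessing of retrieved passages rather than pasting raw text, can be illustrated with a toy pipeline. The corpus, overlap-based retriever, and prompt template are hypothetical stand-ins; a real system would use a vector store and an LLM call:

```python
# Toy RAG pipeline illustrating PE-based preprocessing of retrieved context.
# The corpus, retriever, and template are illustrative assumptions.

CORPUS = [
    "GSM8k contains grade-school math word problems.",
    "TruthfulQA measures whether a model avoids common falsehoods.",
    "Winogrande tests commonsense pronoun resolution.",
]

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank passages by naive word overlap with the query (stand-in for a vector search)."""
    qwords = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda p: len(qwords & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def build_rag_prompt(query: str, passages: list[str]) -> str:
    """PE-based preprocessing: number and delimit context instead of raw concatenation."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return ("Answer using only the numbered context below, and cite the "
            "passage number you used.\n"
            f"Context:\n{context}\nQuestion: {query}\nAnswer:")

query = "What does TruthfulQA measure?"
prompt = build_rag_prompt(query, retrieve(query, CORPUS))
print(prompt)
```

The structured template (numbered passages, an explicit grounding instruction, labeled sections) is the "PE-based preprocessing" step; applying RAG on raw data would correspond to concatenating the passages with no such framing.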
Pages: 32