Staged Multi-Strategy Framework With Open-Source Large Language Models for Natural Language to SQL Generation

Authors
Liu, Chuanlong [1]
Liao, Wei [1]
Xu, Zhen [2]
Affiliations
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
[2] Shanghai Univ Engn Sci, Sch Mech & Automot Engn, Shanghai 201620, Peoples R China
Keywords
open-source large language models; pre-trained language models; natural language to SQL; prompt learning
DOI
10.1002/tee.24268
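Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809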
Abstract
In natural language to SQL (NL2SQL), large pre-trained language models have driven significant progress. However, these models still have deficiencies in generalization, particularly open-source large language models (LLMs). In addition, most research overlooks how key column information and data table content affect query accuracy during SQL statement generation. In this paper, we propose a staged, multi-strategy framework called Key Columns and Table Contents (KCTC). The framework has two stages. First, it uses fixed prompt content to extract SQL key column information (the selected columns and the condition columns) from the natural language question, and it formats the extracted column information. Second, it uses variable prompt content to guide the model in generating SQL statements, and it constrains generation with the content of the data table to reduce the impact of erroneous condition values on the SQL statements. We conducted experiments on the Chinese dataset TableQA using several open-source LLMs. The results show that our method significantly improved the execution accuracy of SQL statements, with an average increase of 60.29% and accuracy reaching up to 91.22%, which validates the effectiveness of our approach. (c) 2025 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
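The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the prompt wording, the JSON output format, the function names (`extract_key_columns`, `build_sql_prompt`, `snap_value`), the `llm` callable, and the use of fuzzy string matching to constrain condition values are all assumptions made for the example.

```python
import difflib
import json
from typing import Callable, Dict, List

# Hypothetical fixed prompt for stage 1 (the wording is an assumption).
STAGE1_PROMPT = (
    "Extract the key columns for the question. Reply with JSON only: "
    '{{"select": [...], "where": [...]}}\n'
    "Columns: {columns}\nQuestion: {question}"
)


def extract_key_columns(
    llm: Callable[[str], str], question: str, columns: List[str]
) -> Dict[str, List[str]]:
    """Stage 1: ask the model for selected and condition columns in a fixed format."""
    reply = llm(STAGE1_PROMPT.format(columns=", ".join(columns), question=question))
    parsed = json.loads(reply)
    # Keep only names that actually exist in the table header.
    return {
        "select": [c for c in parsed.get("select", []) if c in columns],
        "where": [c for c in parsed.get("where", []) if c in columns],
    }


def build_sql_prompt(
    question: str, table: str, rows: List[dict], key_columns: Dict[str, List[str]]
) -> str:
    """Stage 2: build a variable prompt that carries key columns and table content."""
    lines = [
        f"Table: {table}",
        f"Selected columns: {', '.join(key_columns['select'])}",
    ]
    for col in key_columns["where"]:
        # Expose the distinct cell values of each condition column to the model.
        values = sorted({str(r[col]) for r in rows})
        lines.append(f"Values in {col}: {', '.join(values)}")
    lines.append(f"Question: {question}")
    lines.append("Write one SQL statement.")
    return "\n".join(lines)


def snap_value(value: str, candidates: List[str]) -> str:
    """Replace a condition value with the closest real cell value, if one is close."""
    match = difflib.get_close_matches(value, candidates, n=1, cutoff=0.6)
    return match[0] if match else value


def fake_llm(prompt: str) -> str:
    """Stand-in for a real open-source LLM call, used only for this demo."""
    return '{"select": ["price"], "where": ["city"]}'


rows = [{"city": "Shanghai", "price": 10}, {"city": "Beijing", "price": 12}]
keys = extract_key_columns(fake_llm, "What is the price in Shangai?", ["city", "price"])
prompt = build_sql_prompt("What is the price in Shangai?", "houses", rows, keys)
fixed = snap_value("Shangai", [str(r["city"]) for r in rows])
```

In this sketch, `snap_value` stands in for the table-content constraint: a misspelled condition value in the question is mapped to a value that actually occurs in the table before it is used in a `WHERE` clause. The abstract does not specify the matching mechanism, so the 0.6 similarity cutoff is arbitrary.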
Pages: 10