Relationalizing Tables with Large Language Models: The Promise and Challenges

Cited by: 0
Authors
Huang, Zezhou [1 ]
Wu, Eugene [2 ]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Columbia Univ, DSI, New York, NY 10027 USA
Funding
U.S. National Science Foundation
Keywords
Large Language Model; Data Transformation; Prompt Engineering; Data Management
DOI
10.1109/ICDEW61823.2024.00045
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Tables in the wild are usually not relationalized, which makes them difficult to query. To relationalize tables, recent work designed seven transformation operators, and deep neural networks were trained to automatically find the sequence of operators to apply, achieving an accuracy of 57.0%. In comparison, earlier large language models such as GPT-3.5 reached only 13.1%; however, those results were obtained with naive prompts. Moreover, GPT-4, which is substantially larger and more capable, has since become available. This study examines how the choice of model (GPT-3.5 versus GPT-4) and prompting strategy (such as Chain-of-Thought and task decomposition) affects accuracy. The main finding is that GPT-4, combined with task decomposition and Chain-of-Thought, attains a remarkable accuracy of 74.6%. Further analysis of GPT-4's errors shows that about half of them stem not from the model's shortcomings but from ambiguities in the benchmarks. When these benchmarks are disambiguated, GPT-4's accuracy improves to 86.9%.
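To make the prompting strategy concrete, the following is a minimal illustrative sketch (not the paper's actual pipeline) of how GPT-4 could be prompted with task decomposition and Chain-of-Thought to propose an operator sequence for a messy table. The operator names, prompt wording, and the helper function relationalize_plan are assumptions for illustration only.

# Sketch: task decomposition + Chain-of-Thought prompting of GPT-4 to
# propose a relationalization plan. Assumes the OpenAI Python SDK (>=1.0)
# and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical operator vocabulary; the actual seven operators are defined
# in the benchmark the paper builds on.
OPERATORS = ["transpose", "stack", "wide_to_long", "pivot", "explode", "ffill", "subtitle"]

def relationalize_plan(table_preview: str) -> str:
    """Return a proposed operator sequence for a CSV-like table preview."""
    # Subtask 1 (task decomposition): first ask the model to analyze why the
    # table is not relational, reasoning step by step (Chain-of-Thought cue).
    analysis = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                "Here is a table preview:\n" + table_preview +
                "\nExplain, step by step, why this table is not in "
                "relational (tidy) form."
            ),
        }],
    ).choices[0].message.content

    # Subtask 2: given that analysis, choose the operator sequence.
    plan = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                "Table preview:\n" + table_preview +
                "\nStructural analysis:\n" + analysis +
                "\nUsing only these operators: " + ", ".join(OPERATORS) +
                ", list the sequence of operators (with arguments) that "
                "relationalizes the table."
            ),
        }],
    ).choices[0].message.content
    return plan

In this sketch, splitting the problem into an analysis step and a planning step mirrors task decomposition, while the "step by step" instruction elicits Chain-of-Thought reasoning; the exact prompts used in the paper may differ.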
Pages: 305-309
Number of pages: 5