Relationalizing Tables with Large Language Models: The Promise and Challenges

Authors
Huang, Zezhou [1]
Wu, Eugene [2]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Columbia Univ, DSI, New York, NY 10027 USA
Funding
National Science Foundation (USA)
Keywords
Large Language Model; Data Transformation; Prompt Engineering; Data Management
DOI
10.1109/ICDEW61823.2024.00045
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Tables in the wild are usually not relationalized, which makes them difficult to query. To relationalize tables, recent work defined seven transformation operators and used deep neural networks to automatically find the sequence of operators, achieving an accuracy of 57.0%. In comparison, earlier large language models such as GPT-3.5 reached only 13.1%. However, those results were obtained with naive prompts, and GPT-4, which is substantially larger and more performant, has since become available. This study examines how the choice of model (GPT-3.5 or GPT-4) and of prompting strategy (such as Chain-of-Thought and Task Decomposition) affects accuracy. The main finding is that GPT-4, combined with Task Decomposition and Chain-of-Thought, attains an accuracy of 74.6%. Further analysis of GPT-4's errors shows that about half of them stem not from the model's shortcomings but from ambiguities in the benchmarks. When the benchmarks are disambiguated, GPT-4's accuracy improves to 86.9%.
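The "about half of the errors" figure can be cross-checked against the two reported accuracies. The sketch below is only an illustrative consistency check, not the authors' analysis: it assumes both accuracies are measured over the same set of benchmark cases, so that the accuracy gain after disambiguation equals the share of cases that were previously counted as errors because of ambiguity. The function name is hypothetical.

```python
def share_of_errors_resolved(acc_before: float, acc_after: float) -> float:
    """Fraction of the original error rate removed by the accuracy gain.

    Both arguments are accuracies in percent. Assumption (not stated in the
    abstract): both are computed over the same number of benchmark cases.
    """
    error_before = 100.0 - acc_before   # error rate before disambiguation
    gain = acc_after - acc_before       # percentage points recovered
    return gain / error_before


# Figures reported in the abstract: 74.6% before, 86.9% after disambiguation.
share = share_of_errors_resolved(74.6, 86.9)
print(f"{share:.1%}")  # prints 48.4%
```

Under that assumption, the gain of 12.3 percentage points accounts for roughly 48% of the original 25.4% error rate, which is consistent with the abstract's "about half".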
Pages: 305-309
Page count: 5