Tables in the wild are usually not relationalized, making them difficult to query. To relationalize tables, recent work designed seven transformation operators, and deep neural networks were trained to automatically predict the sequence of operators to apply, achieving an accuracy of 57.0%. In comparison, earlier large language models such as GPT-3.5 reached only 13.1%; however, those results were obtained with naive prompts. Moreover, GPT-4, which is substantially larger and more performant, has recently become available. This study examines how the choice of model (GPT-3.5 vs. GPT-4) and prompting strategies such as Chain-of-Thought and task decomposition affect accuracy. The main finding is that GPT-4, combined with task decomposition and Chain-of-Thought, attains a remarkable accuracy of 74.6%. Further analysis of GPT-4's errors reveals that about half of them stem not from the model's shortcomings but from ambiguities in the benchmarks; once these benchmarks are disambiguated, GPT-4's accuracy improves to 86.9%.
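To make the combined strategy concrete, the following is a minimal Python sketch of how task decomposition and Chain-of-Thought prompting might be wired together: rather than asking the model for the full operator sequence at once, the task is decomposed into one operator prediction per step, with the model prompted to reason before answering. The function `call_llm`, the parsing logic, and the operator names are illustrative assumptions, not the study's actual implementation.

```python
# Sketch: task decomposition + Chain-of-Thought for operator prediction.
# The model predicts one transformation operator at a time; the prompt asks
# for step-by-step reasoning before the final answer.

OPERATORS = ["stack", "transpose", "pivot", "none"]  # illustrative subset, not the full set of seven

COT_PROMPT = """You are given a table (first rows shown below):
{table_preview}

Think step by step about the table's layout, then answer with exactly one
operator from {operators} that moves the table closer to relational form,
or "none" if it is already relational.
Reasoning:"""


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., GPT-4 via an API client)."""
    raise NotImplementedError


def parse_operator(response: str) -> str:
    """Take the last operator mentioned in the CoT response as the answer."""
    mentioned = [op for op in OPERATORS if op in response.lower()]
    return mentioned[-1] if mentioned else "none"


def predict_operator_sequence(table_preview: str, max_steps: int = 5) -> list[str]:
    """Decompose the task: predict one operator per step until 'none'."""
    sequence = []
    for _ in range(max_steps):
        prompt = COT_PROMPT.format(table_preview=table_preview, operators=OPERATORS)
        op = parse_operator(call_llm(prompt))
        if op == "none":
            break
        sequence.append(op)
        # A full pipeline would apply the operator here and re-serialize the
        # transformed table into table_preview before the next step.
    return sequence
```

The design choice illustrated is that each LLM call handles a single, simpler subproblem (choose one operator for the current table state), which is the essence of task decomposition, while the Chain-of-Thought instruction elicits intermediate reasoning before the answer.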