A Heterogeneous Graph to Abstract Syntax Tree Framework for Text-to-SQL

Cited by: 2
Authors
Cao, Ruisheng [1]
Chen, Lu [1]
Li, Jieyu [1]
Zhang, Hanchong [1]
Xu, Hongshen [1]
Zhang, Wangyou [1]
Yu, Kai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, X LANCE Lab, MoE Key Lab Artificial Intelligence, Dept Comp Sci & Engn, AI Inst, Shanghai 200240, Peoples R China
Keywords
Structured Query Language; Decoding; Databases; Syntactics; Semantics; Task analysis; Computational modeling; Abstract syntax tree; grammar-based constrained decoding; heterogeneous graph neural network; knowledge-driven natural language processing; permutation invariant problem; text-to-SQL
DOI
10.1109/TPAMI.2023.3298895
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text-to-SQL is the task of converting a natural language utterance plus the corresponding database schema into a SQL program. The inputs naturally form a heterogeneous graph, while the output SQL can be transduced into an abstract syntax tree (AST). Traditional encoder-decoder models ignore higher-order semantics in heterogeneous graph encoding and introduce permutation biases during AST construction, and are thus unable to exploit the refined structural knowledge precisely. In this work, we propose a generic heterogeneous graph to abstract syntax tree (HG2AST) framework to integrate dedicated structural knowledge into statistics-based models. On the encoder side, we leverage a line graph enhanced encoder (LGESQL) to iteratively update both node and edge features through dual graph message passing and aggregation. On the decoder side, a grammar-based decoder first constructs the equivalent SQL AST and then transforms it into the desired SQL via post-processing. To avoid over-fitting permutation biases, we propose a golden tree-oriented learning (GTL) algorithm to adaptively control the expanding order of AST nodes. The graph encoder and tree decoder are combined into a unified framework through two auxiliary modules. Extensive experiments on various text-to-SQL datasets, including single/multi-table, single/cross-domain, and multilingual settings, demonstrate the superiority and broad applicability of the framework.
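The line graph idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook definition of a directed line graph: every edge of the original graph becomes a node, and edge i links to edge j when the target of i is the source of j. The function name `build_line_graph`, the triple format, and the toy node and relation names are hypothetical choices for illustration, not taken from the paper or its code.

```python
from collections import defaultdict


def build_line_graph(edges):
    """Build a directed line graph over a list of typed edges.

    edges: list of (source, target, relation) triples (hypothetical format).
    Returns a list of (i, j) index pairs, meaning edge i is followed by edge j
    because the target node of edge i is the source node of edge j.
    """
    # Index every edge by its source node for fast lookup.
    outgoing = defaultdict(list)
    for idx, (src, _tgt, _rel) in enumerate(edges):
        outgoing[src].append(idx)

    line_edges = []
    for i, (_src, tgt, _rel) in enumerate(edges):
        # .get avoids inserting empty entries for nodes without outgoing edges.
        for j in outgoing.get(tgt, []):
            line_edges.append((i, j))
    return line_edges


# Toy heterogeneous graph: a question word, a column, and a table.
toy_edges = [
    ("q:singer", "col:name", "exact-match"),
    ("col:name", "tab:singer", "belongs-to"),
    ("tab:singer", "col:age", "has"),
]
line_graph = build_line_graph(toy_edges)
print(line_graph)
```

In this toy graph each original edge is a node of the line graph, so an encoder could pass messages between relations (for example from `exact-match` to `belongs-to`) as well as between the question and schema nodes.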
Pages: 13796-13813
Number of pages: 18
Related papers
50 in total
  • [1] SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task
    Yu, Tao
    Yasunaga, Michihiro
    Yang, Kai
    Zhang, Rui
    Wang, Dongxu
    Li, Zifan
    Radev, Dragomir R.
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 1653 - 1663
  • [2] S2SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers
    Hui, Binyuan
    Geng, Ruiying
    Wang, Lihan
    Qin, Bowen
    Li, Yanyang
    Li, Bowen
    Sun, Jian
    Li, Yongbin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1254 - 1262
  • [3] Graph Reasoning Enhanced Language Models for Text-to-SQL
    Gong, Zheng
    Sun, Ying
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 2447 - 2451
  • [4] ShadowGNN: Graph Projection Neural Network for Text-to-SQL Parser
    Chen, Zhi
    Chen, Lu
    Zhao, Yanbin
    Cao, Ruisheng
    Xu, Zihan
    Zhu, Su
    Yu, Kai
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5567 - 5577
  • [5] RECPARSER: A Recursive Semantic Parsing Framework for Text-to-SQL Task
    Zeng, Yu
    Gao, Yan
    Guo, Jiaqi
    Chen, Bei
    Liu, Qian
    Lou, Jian-Guang
    Teng, Fei
    Zhang, Dongmei
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3644 - 3650
  • [6] ReFSQL: A Retrieval-Augmentation Framework for Text-to-SQL Generation
    Zhang, Kun
    Lin, Xiexiong
    Wang, Yuanzhuo
    Zhang, Xin
    Sun, Fei
    Cen, Jianhe
    Jiang, Xuhui
    Tan, Hexiang
    Shen, Huawei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 664 - 673
  • [7] Bridging the gap between text-to-SQL research and real-world applications: A unified all-in-one framework for text-to-SQL
    Han, Mirae
    Park, Seongsik
    Kim, Harksoo
    Kim, Seulgi
    KNOWLEDGE-BASED SYSTEMS, 2024, 306
  • [8] Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing
    Bogin, Ben
    Gardner, Matt
    Berant, Jonathan
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 4560 - 4565
  • [9] On the Vulnerabilities of Text-to-SQL Models
    Peng, Xutan
    Zhang, Yipeng
    Yang, Jingfeng
    Stevenson, Mark
    2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE, 2023, : 1 - 12
  • [10] An interaction-modeling mechanism for context-dependent Text-to-SQL translation based on heterogeneous graph aggregation
    Yu, Wei
    Chang, Tao
    Guo, Xiaoting
    Wang, Mengzhu
    Wang, Xiaodong
    NEURAL NETWORKS, 2021, 142 : 573 - 582