Towards Enhancing Database Education: Natural Language Generation Meets Query Execution Plans

Cited by: 11

Authors:

Wang, Weiguo [1,2]

Bhowmick, Sourav S. [1]

Li, Hui [2]

Joty, Shafiq [1]

Liu, Siyuan [1]

Chen, Peng [2]

Affiliations:

[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore

[2] Xidian Univ, Sch Cyber Engn, Xian, Peoples R China

Source:

SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

REPETITION; BOREDOM;

DOI:

10.1145/3448016.3452822

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The database systems course is offered as part of an undergraduate computer science degree program in many major universities. A key learning goal for learners taking such a course is to understand how SQL queries are processed in an RDBMS in practice. Since a query execution plan (QEP) describes the execution steps of a query, learners can acquire this understanding by perusing the QEPs generated by an RDBMS. Unfortunately, in practice, it is often daunting for a learner to comprehend these QEPs, which contain vendor-specific implementation details, hindering her learning process. In this paper, we present a novel, end-to-end, generic system called LANTERN that generates a natural language description of a QEP to facilitate understanding of the query execution steps. It takes as input an SQL query and its QEP, and generates a natural language description of the execution strategy deployed by the underlying RDBMS. Specifically, it deploys a declarative framework called POOL that enables subject matter experts to efficiently create and maintain natural language descriptions of physical operators used in QEPs. A rule-based framework called RULE-LANTERN is proposed that exploits POOL to generate natural language descriptions of QEPs. Despite the high accuracy of RULE-LANTERN, our engagement with learners reveals that, consistent with existing psychology theories, perusing such rule-based descriptions leads to boredom due to repetitive statements across different QEPs. To address this issue, we present a novel deep learning-based language generation framework called NEURAL-LANTERN that infuses language variability in the generated description by exploiting a set of paraphrasing tools and word embeddings. Our experimental study with real learners shows the effectiveness of LANTERN in facilitating comprehension of QEPs.
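
Illustrative sketch (not from the paper): the abstract describes a rule-based approach in which expert-written descriptions of physical operators are used to narrate a QEP. The Python below shows the general idea of template-based narration under stated assumptions. It is not LANTERN's implementation and does not use POOL's actual syntax. The `TEMPLATES` table, the `describe` function, and the toy plan (hand-written, loosely shaped like the JSON output of PostgreSQL's `EXPLAIN (FORMAT JSON)`) are all hypothetical.

```python
# Hypothetical operator templates. These stand in for expert-written
# operator descriptions; they are NOT the paper's POOL declarations.
TEMPLATES = {
    "Seq Scan": "perform a sequential scan on table {relation}",
    "Hash": "hash the intermediate result",
    "Hash Join": "perform a hash join on condition {cond}",
}


def describe(plan):
    """Return one narration step per operator, children before parents."""
    steps = []
    for child in plan.get("Plans", []):
        steps.extend(describe(child))
    # Fall back to a generic sentence for operators without a template.
    template = TEMPLATES.get(plan["Node Type"], "apply operator {node}")
    steps.append(
        template.format(
            node=plan["Node Type"],
            relation=plan.get("Relation Name", "?"),
            cond=plan.get("Hash Cond", "?"),
        )
    )
    return steps


# Toy plan tree, written by hand for illustration only.
plan = {
    "Node Type": "Hash Join",
    "Hash Cond": "(o.customer_id = c.id)",
    "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "orders"},
        {
            "Node Type": "Hash",
            "Plans": [{"Node Type": "Seq Scan", "Relation Name": "customers"}],
        },
    ],
}

for number, step in enumerate(describe(plan), start=1):
    print(f"Step {number}: {step}.")
```

Because every occurrence of an operator is rendered from the same fixed template, a sketch like this produces near-identical sentences across different plans, which is the kind of repetition the abstract reports as a source of boredom for learners.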

Pages: 1933-1945

Number of pages: 13
