Cleaning Antipatterns in an SQL Query Log

Cited by: 12
Authors
Arzamasova, Natalia [1]
Schaeler, Martin [2]
Bohm, Klemens [3]
Affiliations
[1] Karlsruhe Inst Technol, D-76131 Karlsruhe, Germany
[2] Karlsruhe Inst Technol, Databases & Informat Syst Grp, D-76131 Karlsruhe, Germany
[3] Karlsruhe Inst Technol, Databases & Informat Syst, D-76131 Karlsruhe, Germany
Keywords
SQL log analysis; patterns and antipatterns; data preprocessing; E-SCIENCE;
DOI
10.1109/TKDE.2017.2772252
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Today, many scientific data sets are open to the public. For their operators, it is important to know what the users are interested in. In this paper, we study the problem of extracting and analyzing patterns from the query log of a database. We focus on design errors (antipatterns), which typically lead to unnecessary SQL statements. Such antipatterns do not only have a negative effect on performance. They also introduce bias on any subsequent analysis of the SQL log. We propose a framework designed to discover patterns and antipatterns in arbitrary SQL query logs and to clean antipatterns. To study the usefulness of our approach and to reveal insights regarding the existence of antipatterns in real-world systems, we examine the SQL log of the SkyServer project, containing more than 40 million queries. Among the top 15 patterns, we have found six antipatterns. This result as well as other ones gives way to the conclusion that antipatterns might falsify refactoring and any other downstream analyses.
Pages: 421-434
Page count: 14
Related papers
50 in total
  • [31] SQL query extensions for imprecise questions
    Le Guilly, Marie
    Petit, Jean-Marc
    Scuturici, Vasile-Marian
    Data and Knowledge Engineering, 2022, 137
  • [32] Facilitating SQL Query Composition and Analysis
    Zolaktaf, Zainab
    Milani, Mostafa
    Pottinger, Rachel
    SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2020, : 209 - 224
  • [33] Qex: Symbolic SQL Query Explorer
    Veanes, Margus
    Tillmann, Nikolai
    de Halleux, Jonathan
    LOGIC FOR PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND REASONING (LPAR-16), 2010, 6355 : 425 - 446
  • [34] Learning to Mine Query Subtopics from Query Log
    Zhang, Zhenzhong
    Sun, Le
    Han, Xianpei
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 341 - 345
  • [35] Mining query log graphs towards a query folksonomy
    Francisco, Alexandre P.
    Baeza-Yates, Ricardo
    Oliveira, Arlindo L.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2012, 24 (17): : 2179 - 2192
  • [36] QUERY INTENT DETECTION BASED ON QUERY LOG MINING
    Zamora, Juan
    Mendoza, Marcelo
    Allende, Hector
    JOURNAL OF WEB ENGINEERING, 2014, 13 (1-2): : 24 - 52
  • [37] Detection of SQL Injection Attacks by Removing the Parameter Values of SQL Query
    Katole, Rajashree A.
    Sherekar, Swati S.
    Thakare, Vilas M.
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INVENTIVE SYSTEMS AND CONTROL (ICISC 2018), 2018, : 736 - 741
  • [38] Use of Visual Query System in teaching database query language SQL
    Chen, PK
    Sheen, CY
    Chen, GD
    ADVANCED RESEARCH IN COMPUTERS AND COMMUNICATIONS IN EDUCATION, VOL 1: NEW HUMAN ABILITIES FOR THE NETWORKED SOCIETY, 1999, 55 : 800 - 807
  • [39] Query Expansion Based on Query Log and Small World Characteristic
    Cao, Yujuan
    Peng, Xueping
    Kun, Zhao
    Niu, Zhendong
    Xu, Gx
    Wang, Weiqiang
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2009, PROCEEDINGS, 2009, 5802 : 573 - +
  • [40] XML-SQL: An XML query language based on SQL and path tables
    Pankowski, T
    XML-BASED DATA MANAGEMENT AND MULTIMEDIA ENGINEERING-EDBT 2002 WORKSHOPS, 2002, 2490 : 184 - 209