Cleaning Antipatterns in an SQL Query Log

Cited by: 12
Authors
Arzamasova, Natalia [1]
Schaeler, Martin [2]
Bohm, Klemens [3]
Affiliations
[1] Karlsruhe Inst Technol, D-76131 Karlsruhe, Germany
[2] Karlsruhe Inst Technol, Databases & Informat Syst Grp, D-76131 Karlsruhe, Germany
[3] Karlsruhe Inst Technol, Databases & Informat Syst, D-76131 Karlsruhe, Germany
Keywords
SQL log analysis; patterns and antipatterns; data preprocessing; e-science
DOI
10.1109/TKDE.2017.2772252
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Today, many scientific data sets are open to the public. For their operators, it is important to know what the users are interested in. In this paper, we study the problem of extracting and analyzing patterns from the query log of a database. We focus on design errors (antipatterns), which typically lead to unnecessary SQL statements. Such antipatterns not only degrade performance; they also bias any subsequent analysis of the SQL log. We propose a framework designed to discover patterns and antipatterns in arbitrary SQL query logs and to clean antipatterns. To study the usefulness of our approach and to reveal insights regarding the existence of antipatterns in real-world systems, we examine the SQL log of the SkyServer project, which contains more than 40 million queries. Among the top 15 patterns, we found six antipatterns. This and other results suggest that antipatterns might falsify refactoring and any other downstream analyses.
Pages: 421-434
Number of pages: 14