Cleaning Antipatterns in an SQL Query Log

Cited by: 12
Authors
Arzamasova, Natalia [1]
Schaeler, Martin [2 ]
Bohm, Klemens [3 ]
Affiliations
[1] Karlsruhe Inst Technol, D-76131 Karlsruhe, Germany
[2] Karlsruhe Inst Technol, Databases & Informat Syst Grp, D-76131 Karlsruhe, Germany
[3] Karlsruhe Inst Technol, Databases & Informat Syst, D-76131 Karlsruhe, Germany
Keywords
SQL log analysis; patterns and antipatterns; data preprocessing; E-SCIENCE
DOI
10.1109/TKDE.2017.2772252
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Today, many scientific data sets are open to the public. For their operators, it is important to know what the users are interested in. In this paper, we study the problem of extracting and analyzing patterns from the query log of a database. We focus on design errors (antipatterns), which typically lead to unnecessary SQL statements. Such antipatterns do not only have a negative effect on performance. They also introduce bias on any subsequent analysis of the SQL log. We propose a framework designed to discover patterns and antipatterns in arbitrary SQL query logs and to clean antipatterns. To study the usefulness of our approach and to reveal insights regarding the existence of antipatterns in real-world systems, we examine the SQL log of the SkyServer project, containing more than 40 million queries. Among the top 15 patterns, we have found six antipatterns. This result as well as other ones gives way to the conclusion that antipatterns might falsify refactoring and any other downstream analyses.
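To make the idea of patterns in a query log concrete, here is a minimal sketch in Python. It is not the framework proposed in the paper: it assumes that a "pattern" can be approximated by a query template (the statement with its literals replaced by placeholders) and that frequent templates are candidates for closer inspection. The table names, the sample log, and the helpers `to_template` and `top_templates` are hypothetical.

```python
import re
from collections import Counter


def to_template(query: str) -> str:
    """Reduce a SQL statement to a template by masking its literals.

    Assumption: two queries belong to the same pattern if they differ
    only in string or numeric literals. This is a simplification.
    """
    q = query.strip().rstrip(";")
    q = re.sub(r"'(?:[^']|'')*'", "?", q)       # mask string literals
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)    # mask numeric literals
    q = re.sub(r"\s+", " ", q)                  # collapse whitespace
    return q.lower()


def top_templates(log, n=3):
    """Return the n most frequent templates with their counts."""
    return Counter(to_template(q) for q in log).most_common(n)


# Hypothetical log: three single-row lookups that one query with an
# IN list or a join could have replaced, plus one unrelated query.
log = [
    "SELECT name FROM stars WHERE id = 1",
    "SELECT name FROM stars WHERE id = 2",
    "SELECT name FROM stars WHERE id = 3",
    "SELECT * FROM galaxies WHERE ra > 10.5",
]

for template, count in top_templates(log):
    print(count, template)
```

A template that dominates the log, such as the repeated single-row lookup here, inflates frequency counts for the tables it touches, which is one way unnecessary statements can bias a later analysis of user interest.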
Pages: 421-434
Number of pages: 14