Cleaning Antipatterns in an SQL Query Log

Cited by: 12
Authors
Arzamasova, Natalia [1]
Schaeler, Martin [2]
Bohm, Klemens [3]
Affiliations
[1] Karlsruhe Inst Technol, D-76131 Karlsruhe, Germany
[2] Karlsruhe Inst Technol, Databases & Informat Syst Grp, D-76131 Karlsruhe, Germany
[3] Karlsruhe Inst Technol, Databases & Informat Syst, D-76131 Karlsruhe, Germany
Keywords
SQL log analysis; patterns and antipatterns; data preprocessing; e-science
DOI
10.1109/TKDE.2017.2772252
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Today, many scientific data sets are open to the public. For their operators, it is important to know what the users are interested in. In this paper, we study the problem of extracting and analyzing patterns from the query log of a database. We focus on design errors (antipatterns), which typically lead to unnecessary SQL statements. Such antipatterns do not only have a negative effect on performance. They also introduce bias on any subsequent analysis of the SQL log. We propose a framework designed to discover patterns and antipatterns in arbitrary SQL query logs and to clean antipatterns. To study the usefulness of our approach and to reveal insights regarding the existence of antipatterns in real-world systems, we examine the SQL log of the SkyServer project, containing more than 40 million queries. Among the top 15 patterns, we have found six antipatterns. This result as well as other ones gives way to the conclusion that antipatterns might falsify refactoring and any other downstream analyses.
Pages: 421-434
Page count: 14