A Content-Based Approach for Modeling Analytics Operators

Cited by: 1
Authors
Giannakopoulos, Ioannis [1]
Tsoumakos, Dimitrios [2]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Sch ECE, Athens, Greece
[2] Ionian Univ, Dept Informat, Corfu, Greece
DOI
10.1145/3269206.3271731
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The plethora of publicly available data sources has given birth to a wealth of new needs and opportunities. The ever-increasing amount of data has shifted analysts' attention from optimizing operators for specific business cases to the datasets themselves, selecting the ones that are most suitable for specific operators, i.e., those that make an operator produce a specific output. Yet, predicting the output of a given operator executed for different input datasets is not an easy task: it entails executing the operator for all of them, something that requires excessive computational power and time. To tackle this challenge, we propose a novel dataset profiling methodology that infers an operator's outcome by examining the similarity of the available input datasets in specific attributes. Our methodology quantifies dataset similarities and projects them into a low-dimensional space. The operator is then executed for a mere subset of the available datasets, and its output for the rest of them is approximated by neural networks trained with this space as input. Our experimental evaluation thoroughly examines the performance of our scheme using both synthetic and real-world datasets, indicating that the suggested approach is capable of predicting an operator's output with high accuracy. Moreover, it massively accelerates operator profiling in comparison to approaches that require an exhaustive operator execution, rendering our work ideal for cases where a multitude of operators need to be executed on a given set of datasets.
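The pipeline outlined in the abstract (quantify dataset similarities, project them into a low-dimensional space, run the operator on a subset, approximate the rest with a neural network) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the similarity measure (distance between per-column means), the projection (classical multidimensional scaling), the tiny NumPy network, the stand-in `operator`, and the synthetic data.

```python
import numpy as np

def pairwise_distances(datasets):
    """Hypothetical similarity measure: distance between per-column means."""
    profiles = np.array([d.mean(axis=0) for d in datasets])
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(dist, k=2):
    """Project a distance matrix into k dimensions with classical MDS."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:k]          # indices of the k largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def train_mlp(x, y, hidden=16, epochs=3000, lr=0.02, seed=0):
    """One-hidden-layer tanh network, full-batch gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)
        err = (h @ w2 + b2) - y                  # prediction error per sample
        grad_h = (err @ w2.T) * (1.0 - h ** 2)   # backpropagate through tanh
        w2 -= lr * h.T @ err / len(x)
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * x.T @ grad_h / len(x)
        b1 -= lr * grad_h.mean(axis=0)
    return lambda q: (np.tanh(q @ w1 + b1) @ w2 + b2).ravel()

def operator(data):
    """Stand-in for an expensive analytics operator (assumed for this sketch)."""
    return float(data.sum(axis=1).mean())

# Synthetic collection: 60 datasets, each shifted by its own per-column offset.
rng = np.random.default_rng(42)
offsets = rng.uniform(-2.0, 2.0, size=(60, 2))
datasets = [rng.normal(loc=o, scale=1.0, size=(200, 2)) for o in offsets]

dist = pairwise_distances(datasets)
coords = classical_mds(dist, k=2)                # low-dimensional dataset space

# Execute the operator on a subset only, then approximate it everywhere else.
sampled = rng.choice(len(datasets), size=20, replace=False)
observed = np.array([operator(datasets[i]) for i in sampled])
model = train_mlp(coords[sampled], observed)
predicted = model(coords)

truth = np.array([operator(d) for d in datasets])  # computed here only to check the sketch
print("mean absolute error:", np.abs(predicted - truth).mean())
```

In this toy setup the operator runs on 20 of the 60 datasets during profiling; the exhaustive run at the end exists only to measure the approximation error and would be skipped in practice.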
Pages: 227-236
Number of pages: 10