A Content-Based Approach for Modeling Analytics Operators

Cited by: 1
Authors
Giannakopoulos, Ioannis [1]
Tsoumakos, Dimitrios [2]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Sch ECE, Athens, Greece
[2] Ionian Univ, Dept Informat, Corfu, Greece
DOI
10.1145/3269206.3271731
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The plethora of publicly available data sources has created a wealth of new needs and opportunities. The ever-increasing amount of data has shifted analysts' attention from optimizing operators for specific business cases to the datasets themselves: selecting those most suitable for a given operator, i.e., those that make the operator produce a specific output. Yet predicting the output of a given operator across different input datasets is not easy: it entails executing the operator on all of them, which requires excessive computational power and time. To tackle this challenge, we propose a novel dataset profiling methodology that infers an operator's outcome by examining the similarity of the available input datasets in specific attributes. Our methodology quantifies dataset similarities and projects them into a low-dimensional space. The operator is then executed on only a subset of the available datasets, and its output for the rest is approximated using Neural Networks trained with this space as input. Our experimental evaluation examines the performance of our scheme on both synthetic and real-world datasets and indicates that the approach predicts an operator's output with high accuracy. Moreover, it greatly accelerates operator profiling compared to approaches that require exhaustive operator execution, making our work well suited to cases where a multitude of operators must be executed on a set of given datasets.
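The pipeline outlined in the abstract (quantify dataset similarity, project into a low-dimensional space, run the operator on a subset, approximate the rest with a neural network) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the paper: the (mean, std) descriptor, the Euclidean distance between descriptors, classical multidimensional scaling as the projection, the tiny one-hidden-layer network, the synthetic Gaussian datasets, and the hypothetical `expensive_operator`.

```python
import math
import random

def summarize(values):
    """Describe a dataset by (mean, std): an assumed stand-in for the compared attributes."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return (mean, std)

def classical_mds(dist, k=2, iters=300):
    """Embed n items in k dimensions from an n x n distance matrix.

    Uses power iteration with deflation to approximate the top-k eigenpairs.
    """
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    # Double-centred Gram matrix: B = -1/2 * J * D^2 * J.
    gram = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]
    rng = random.Random(0)
    coords = [[0.0] * k for _ in range(n)]
    for c in range(k):
        vec = [rng.random() for _ in range(n)]
        for _ in range(iters):
            prod = [sum(g * v for g, v in zip(row, vec)) for row in gram]
            norm = math.sqrt(sum(p * p for p in prod)) or 1.0
            vec = [p / norm for p in prod]
        # The Rayleigh quotient estimates the eigenvalue belonging to vec.
        eig = sum(v * sum(g * u for g, u in zip(row, vec)) for v, row in zip(vec, gram))
        for i in range(n):
            coords[i][c] = vec[i] * math.sqrt(max(eig, 0.0))
            for j in range(n):
                gram[i][j] -= eig * vec[i] * vec[j]  # deflate the found component
    return coords

def train_mlp(xs, ys, hidden=8, lr=0.02, epochs=1500, seed=0):
    """Fit a one-hidden-layer tanh network with per-sample gradient descent.

    Minimises 0.5 * (prediction - target)^2; returns (predict, per-epoch mean squared errors).
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = [0.0]

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w1, b1)]
        return h, sum(w * a for w, a in zip(w2, h)) + b2[0]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(xs, ys):
            h, out = forward(x)
            err = out - y
            total += err * err
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)  # uses w2[j] before its update
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(dim):
                    w1[j][i] -= lr * grad_h * x[i]
            b2[0] -= lr * err
        losses.append(total / len(xs))
    return (lambda x: forward(x)[1]), losses

def expensive_operator(values):
    """Hypothetical operator: fraction of values above 1.0."""
    return sum(v > 1.0 for v in values) / len(values)

# Synthetic datasets (assumed): Gaussian samples with varying mean and spread.
data_rng = random.Random(42)
datasets = []
for _ in range(30):
    mu, sigma = data_rng.uniform(0.0, 2.0), data_rng.uniform(0.5, 1.5)
    datasets.append([data_rng.gauss(mu, sigma) for _ in range(200)])

features = [summarize(d) for d in datasets]
dist = [[math.dist(a, b) for b in features] for a in features]
coords = classical_mds(dist, k=2)
scale = max(abs(c) for row in coords for c in row) or 1.0
coords = [[c / scale for c in row] for row in coords]  # keep inputs within [-1, 1]

train_idx = list(range(0, 30, 3))  # the operator runs on only 10 of the 30 datasets
test_idx = [i for i in range(30) if i not in train_idx]
y_train = [expensive_operator(datasets[i]) for i in train_idx]
predict, losses = train_mlp([coords[i] for i in train_idx], y_train)
predicted = {i: predict(coords[i]) for i in test_idx}  # approximated, never executed
```

In this sketch the operator is evaluated on a third of the datasets, and the entries of `predicted` stand in for the executions that are skipped. How well such estimates match real operator output depends on the similarity measure and the operator; the sketch makes no claim about the accuracy reported in the paper.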
Pages: 227-236
Page count: 10