Computing Prominent Skyline on Massive Data

Authors
Wan, Xiaolong [1]
Han, Xixian [1]
Wang, Jinbao [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, 92 Xidazhi St, Harbin, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
P-skyline; Massive data; Selective retrieval; Selective checking; COMPUTATION; ALGORITHMS; QUERIES
DOI
10.1007/s41019-024-00259-6
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In many practical applications, the skyline query is an important operation that returns the Pareto-optimal tuples, which provide a candidate set for the optimum. On massive data, a skyline query often reports too many results, so users are overwhelmed and have difficulty finding the desired information. This paper devises P-skyline to reduce the size of the returned result set. Given an approximation factor, P-skyline returns only the prominent skyline results, as determined by the definition of p-dominance. To the best of our knowledge, this paper is the first to study the P-skyline problem. The paper first proposes a baseline algorithm, which requires one full table scan to compute the results. The baseline algorithm is found to incur a relatively high execution cost on massive data. The PSTP algorithm is then proposed, which consists of two stages: candidate acquisition and refinement. On the presorted table, PSTP uses selective retrieval and selective checking to process P-skyline with much lower I/O cost and computation cost. Extensive experiments on synthetic and real-life data sets show that PSTP can compute P-skyline efficiently on massive data.
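For readers unfamiliar with the underlying operation, the following is a minimal sketch of a conventional skyline computed in a single pass over a table. It is not the paper's baseline algorithm or PSTP, and it does not implement p-dominance, whose definition the abstract does not give. The function names `dominates` and `skyline`, the sample data, and the convention that smaller values are better on every attribute are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b.

    Assumption: smaller is better on every attribute. a dominates b when
    a is no worse on all attributes and strictly better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(tuples: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Conventional skyline via one scan with a window of candidates."""
    window: List[Tuple[float, ...]] = []
    for t in tuples:
        # Skip t if some current candidate already dominates it.
        if any(dominates(w, t) for w in window):
            continue
        # Otherwise drop every candidate that t dominates, then keep t.
        window = [w for w in window if not dominates(t, w)]
        window.append(t)
    return window


if __name__ == "__main__":
    # Hypothetical (price, distance) pairs; lower is better on both.
    hotels = [(50, 8), (60, 3), (70, 2), (65, 9), (50, 9)]
    print(skyline(hotels))
```

On the hypothetical input, `(65, 9)` and `(50, 9)` are dropped because `(50, 8)` is at least as good on both attributes and strictly better on one. The candidate window can grow with the data, which is the kind of result-set blow-up the abstract describes as the motivation for P-skyline.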
Pages: 117-146
Number of pages: 30