Computing Prominent Skyline on Massive Data

Cited by: 0
Authors
Wan, Xiaolong [1]
Han, Xixian [1]
Wang, Jinbao [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, 92 Xidazhi St, Harbin, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
P-skyline; Massive data; Selective retrieval; Selective checking; COMPUTATION; ALGORITHMS; QUERIES
DOI
10.1007/s41019-024-00259-6
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In many practical applications, the skyline query is an important operation that returns the Pareto-optimal tuples, providing a candidate set for the optimum. On massive data, the skyline often contains so many results that users are overwhelmed and struggle to find the information they want. This paper devises P-skyline to reduce the size of the returned result set. Given an approximation factor, P-skyline returns only the prominent skyline results, as determined by the definition of p-dominance. To the best of our knowledge, this is the first work to study the P-skyline problem. The paper first proposes a baseline algorithm, which requires one full table scan to compute the results and is found to incur a relatively high execution cost on massive data. The PSTP algorithm is then proposed, which consists of two stages: candidate acquisition and refinement. On a presorted table, PSTP uses selective retrieval and selective checking to process P-skyline with much lower I/O and computation cost. Extensive experiments on synthetic and real-life data sets show that PSTP computes P-skyline on massive data efficiently.
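For readers unfamiliar with the underlying operation, the following is a minimal sketch of the classic skyline (Pareto-optimal set) computed in one pass over the data with an in-memory window. It is not the paper's baseline or PSTP algorithm, and it does not implement p-dominance or the approximation factor, neither of which is defined in this abstract. The names `dominates` and `skyline`, and the convention that smaller values are better in every dimension, are assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Classic Pareto dominance (assumed convention: smaller is better).

    a dominates b if a is no worse than b in every dimension
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: Iterable[Point]) -> List[Point]:
    """Return the tuples that no other tuple dominates, in one pass."""
    window: List[Point] = []  # current skyline candidates
    for p in points:
        # Skip p if some current candidate already dominates it.
        if any(dominates(w, p) for w in window):
            continue
        # Otherwise p evicts every candidate it dominates, then joins.
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


if __name__ == "__main__":
    data = [(1, 9), (3, 3), (2, 8), (4, 4), (9, 1), (5, 5)]
    print(skyline(data))
```

On the sample data, `(4, 4)` and `(5, 5)` are dropped because `(3, 3)` dominates both, while the remaining four tuples are mutually incomparable. Even this tiny example shows how the skyline can stay large relative to the input, which is the problem the abstract says P-skyline is meant to address.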
Pages: 117-146
Page count: 30
Related papers
  • [1] Computing Prominent Skyline on Massive Data
    Xiaolong Wan
    Xixian Han
    Jinbao Wang
    Data Science and Engineering, 2025, 10 (1) : 117 - 146
  • [2] Dynamic skyline computation on massive data
    Xixian Han
    Bailing Wang
    Guojun Lai
    Knowledge and Information Systems, 2019, 59 : 571 - 599
  • [3] Dynamic skyline computation on massive data
    Han, Xixian
    Wang, Bailing
    Lai, Guojun
    KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 59 (03) : 571 - 599
  • [4] Efficient Skyline Computation on Massive Incomplete Data
    He, Jingxuan
    Han, Xixian
    DATA SCIENCE AND ENGINEERING, 2022, 7 (02) : 102 - 119
  • [5] Efficient Skyline Computation on Massive Incomplete Data
    Jingxuan He
    Xixian Han
    Data Science and Engineering, 2022, 7 : 102 - 119
  • [6] Constrained Skyline Computing over Data Streams
    Lin, Jin-xian
    Wei, Jing-jing
    PROCEEDINGS OF THE ICEBE 2008: IEEE INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING, 2008, : 155 - +
  • [7] Modeling and Computing Probabilistic Skyline on Incomplete Data
    Zhang, Kaiqi
    Gao, Hong
    Han, Xixian
    Cai, Zhipeng
    Li, Jianzhong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (07) : 1405 - 1418
  • [8] Computing All Skyline Probabilities for Uncertain Data
    Atallah, Mikhail J.
    Qi, Yinian
    PODS'09: PROCEEDINGS OF THE TWENTY-EIGHTH ACM SIGMOD-SIGACT-SIGART SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2009, : 279 - 287
  • [9] SEPT: an efficient skyline join algorithm on massive data
    Xixian Han
    Jianzhong Li
    Hong Gao
    Chengyu Yang
    Knowledge and Information Systems, 2015, 43 : 355 - 388
  • [10] SEPT: an efficient skyline join algorithm on massive data
    Han, Xixian
    Li, Jianzhong
    Gao, Hong
    Yang, Chengyu
    KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 43 (02) : 355 - 388