More accurate streaming cardinality estimation with vectorized counters

Cited by: 1
Authors
Bruschi, Valerio [1]
Reviriego, Pedro [2]
Pontarelli, Salvatore [3]
Ting, Daniel [4]
Bianchi, Giuseppe [1]
Affiliations
[1] Bruschi, Valerio
[2] Reviriego, Pedro
[3] Pontarelli, Salvatore
[4] Ting, Daniel
[5] Bianchi, Giuseppe
Source
Corresponding author: Bruschi, Valerio (valerio.bruschi@uniroma2.it) | 2021 / Publisher: Institute of Electrical and Electronics Engineers Inc. / Issue 03
DOI
10.1109/LNET.2021.3076048
Abstract
Cardinality estimation, also known as count-distinct, is the problem of finding the number of different elements in a set with repeated elements. Among the many approximate algorithms proposed for this task, HyperLogLog (HLL) has established itself as the state of the art due to its ability to accurately estimate cardinality over a large range of values using a small memory footprint. When elements arrive in a stream, as in the case of most networking applications, improved techniques are possible. We specifically propose a new algorithm that improves the accuracy of cardinality estimation by grouping counters, and by using their new organization to further track all updates within a given counter size range (compared with just the last update as in the standard HLL). Results show that when using the same number of counters, one configuration of the new scheme reduces the relative error by approximately 0.86x using the same amount of memory as the streaming HLL and another configuration achieves a similar accuracy reducing the memory needed by approximately 0.85x. © 2019 IEEE.
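For orientation, the streaming HLL baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of a plain HyperLogLog register array combined with a running (martingale-style) estimate that is updated whenever a register changes; it is not the vectorized-counter scheme proposed in the paper. The class name `StreamingHLL`, the parameter `p`, and the choice of BLAKE2b as the hash function are assumptions made for this example only.

```python
import hashlib


class StreamingHLL:
    """Illustrative HyperLogLog sketch with a running streaming estimate.

    Hypothetical helper for this example; not the paper's algorithm.
    """

    HASH_BITS = 64

    def __init__(self, p=10):
        self.p = p                      # number of index bits (assumed default)
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m   # each stores the largest rank seen
        self.q = 1.0                    # probability a new item changes the sketch
        self.estimate = 0.0             # running cardinality estimate

    def _hash(self, item):
        # 64-bit hash of the item's string form (assumed hash choice).
        digest = hashlib.blake2b(str(item).encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def add(self, item):
        h = self._hash(item)
        rest_bits = self.HASH_BITS - self.p
        idx = h >> rest_bits                    # top p bits select a register
        rest = h & ((1 << rest_bits) - 1)       # remaining bits give the rank
        # Rank = number of leading zeros in the remaining bits, plus one.
        rank = rest_bits - rest.bit_length() + 1
        old = self.registers[idx]
        if rank > old:
            # The sketch changes: credit 1/q distinct items, where q is the
            # probability (before this update) that a new item causes a change.
            self.estimate += 1.0 / self.q
            self.registers[idx] = rank
            # Keep q = (1/m) * sum(2^-register) up to date incrementally.
            self.q += (2.0 ** -rank - 2.0 ** -old) / self.m


if __name__ == "__main__":
    sketch = StreamingHLL(p=10)
    for i in range(10000):
        sketch.add(f"flow-{i}")
        sketch.add(f"flow-{i}")   # repeated elements leave the sketch unchanged
    print(round(sketch.estimate))
```

Because the running estimate is only meaningful when every element is seen in arrival order, this style of estimator fits the streaming setting the abstract describes; two such sketches cannot be merged while preserving the running estimate.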
Pages: 75-79