More accurate streaming cardinality estimation with vectorized counters

Cited by: 1
Authors
Bruschi, Valerio [1 ]
Reviriego, Pedro [2 ]
Pontarelli, Salvatore [3 ]
Ting, Daniel [4 ]
Bianchi, Giuseppe [1 ]
Affiliations
[1] Bruschi, Valerio
[2] Reviriego, Pedro
[3] Pontarelli, Salvatore
[4] Ting, Daniel
[5] Bianchi, Giuseppe
Source
Bruschi, Valerio (valerio.bruschi@uniroma2.it) | Year 1600 / Institute of Electrical and Electronics Engineers Inc. Volume / Issue 03
DOI
10.1109/LNET.2021.3076048
Abstract
Cardinality estimation, also known as count-distinct, is the problem of finding the number of different elements in a set with repeated elements. Among the many approximate algorithms proposed for this task, HyperLogLog (HLL) has established itself as the state of the art due to its ability to accurately estimate cardinality over a large range of values using a small memory footprint. When elements arrive in a stream, as in the case of most networking applications, improved techniques are possible. We specifically propose a new algorithm that improves the accuracy of cardinality estimation by grouping counters, and by using their new organization to further track all updates within a given counter size range (compared with just the last update as in the standard HLL). Results show that when using the same number of counters, one configuration of the new scheme reduces the relative error by approximately 0.86x using the same amount of memory as the streaming HLL and another configuration achieves a similar accuracy reducing the memory needed by approximately 0.85x. © 2019 IEEE.
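To make the streaming setting mentioned in the abstract concrete, the following is a minimal sketch of a plain HyperLogLog that keeps a running (martingale-style) estimate, updated each time a counter changes. It is not the vectorized-counter scheme proposed in the paper, whose details are not given in this record. The class name `StreamingHLL`, the parameter `p`, and the choice of BLAKE2b as the hash are assumptions made purely for illustration.

```python
import hashlib


class StreamingHLL:
    """Illustrative HyperLogLog with a running streaming estimate.

    Hypothetical sketch: NOT the grouped/vectorized counters of the paper.
    """

    def __init__(self, p: int = 10) -> None:
        self.p = p                      # number of hash bits used as counter index
        self.m = 1 << p                 # number of counters
        self.registers = [0] * self.m   # each counter stores the largest rank seen
        self.estimate = 0.0             # running cardinality estimate
        # Probability that a previously unseen element changes some counter:
        # (1/m) * sum over counters of 2^(-counter value); equals 1 when empty.
        self.change_prob = 1.0

    def _hash64(self, item: str) -> int:
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def add(self, item: str) -> None:
        h = self._hash64(item)
        width = 64 - self.p
        idx = h >> width                      # top p bits select the counter
        rest = h & ((1 << width) - 1)         # remaining bits determine the rank
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        old = self.registers[idx]
        if rank > old:
            # Credit the estimate with 1 / P(change) using the state
            # before the update, then refresh that probability.
            self.estimate += 1.0 / self.change_prob
            self.change_prob += (2.0 ** -rank - 2.0 ** -old) / self.m
            self.registers[idx] = rank


if __name__ == "__main__":
    sketch = StreamingHLL(p=10)
    for i in range(10000):
        sketch.add(f"flow-{i}")
    print(round(sketch.estimate))
```

Because a repeated element hashes to the same counter and rank, it never triggers an update, so duplicates leave the running estimate unchanged. The estimate is only valid when elements are processed one by one as a stream, since it depends on the history of counter changes rather than on the final counter values alone.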
Pages: 75-79
Related papers
50 in total
  • [21] MORE ON COUNTERS
    NAWALINSKI, T
    EDN MAGAZINE-ELECTRICAL DESIGN NEWS, 1979, 24 (16) : 16 - 16
  • [22] MORE ACCURATE ESTIMATION OF THE EXTENT OF GASTRIC RESECTION
    EVERSON, TC
    ANNALS OF SURGERY, 1954, 140 (02) : 260 - 260
  • [23] Towards a More Accurate Knowledge Level Estimation
    Kardan, Samad
    Kardan, Ahmad
    PROCEEDINGS OF THE 2009 SIXTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, VOLS 1-3, 2009, : 1134 - 1139
  • [24] A More Accurate Estimation of Semiparametric Logistic Regression
    Zheng, Xia
    Rong, Yaohua
    Liu, Ling
    Cheng, Weihu
    MATHEMATICS, 2021, 9 (19)
  • [25] Towards More Efficient Cardinality Estimation for Large-Scale RFID Systems
    Zheng, Yuanqing
    Li, Mo
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2014, 22 (06) : 1886 - 1896
  • [26] MORE TAPE COUNTERS
    GILLMAN, L
    AMERICAN MATHEMATICAL MONTHLY, 1993, 100 (03) : 286 - 286
  • [27] Forest of Normalized Trees: Fast and Accurate Density Estimation of Streaming Data
    Rehn, Patrick
    Ahmadi, Zahra
    Kramer, Stefan
    2018 IEEE 5TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2018, : 199 - 208
  • [28] More on the cardinality of a topological space
    Bonanzinga, M.
    Carlson, N.
    Cuzzupe, M., V
    Stavrova, D.
    APPLIED GENERAL TOPOLOGY, 2018, 19 (02) : 269 - 280
  • [29] Cuckoo Counter: Adaptive Structure of Counters for Accurate Frequency and Top-k Estimation
    Shi, Qilong
    Xu, Yuchen
    Qi, Jiuhua
    Li, Wenjun
    Yang, Tong
    Xu, Yang
    Wang, Yi
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (04) : 1854 - 1869
  • [30] Toward More Rigorous and Practical Cardinality Estimation for Large-Scale RFID Systems
    Gong, Wei
    Liu, Jiangchuan
    Liu, Kebin
    Liu, Yunhao
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2017, 25 (03) : 1347 - 1358