More accurate streaming cardinality estimation with vectorized counters

Cited by: 1
Authors
Bruschi, Valerio [1]
Reviriego, Pedro [2]
Pontarelli, Salvatore [3]
Ting, Daniel [4]
Bianchi, Giuseppe [1]
Affiliations
[1] Bruschi, Valerio
[2] Reviriego, Pedro
[3] Pontarelli, Salvatore
[4] Ting, Daniel
[5] Bianchi, Giuseppe
Source
Bruschi, Valerio (valerio.bruschi@uniroma2.it) | Institute of Electrical and Electronics Engineers Inc. | Issue 03
DOI
10.1109/LNET.2021.3076048
Abstract
Cardinality estimation, also known as count-distinct, is the problem of finding the number of different elements in a set with repeated elements. Among the many approximate algorithms proposed for this task, HyperLogLog (HLL) has established itself as the state of the art due to its ability to accurately estimate cardinality over a large range of values using a small memory footprint. When elements arrive in a stream, as in the case of most networking applications, improved techniques are possible. We specifically propose a new algorithm that improves the accuracy of cardinality estimation by grouping counters, and by using their new organization to further track all updates within a given counter size range (compared with just the last update as in the standard HLL). Results show that when using the same number of counters, one configuration of the new scheme reduces the relative error by approximately 0.86x using the same amount of memory as the streaming HLL and another configuration achieves a similar accuracy reducing the memory needed by approximately 0.85x. © 2019 IEEE.
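For readers unfamiliar with the baseline, the following is a minimal sketch of a standard HyperLogLog counter in Python. It is not the vectorized-counter scheme proposed in the paper, nor the streaming HLL variant the abstract compares against. The class name, the choice of BLAKE2b as the hash, the 64-bit hash width, and the default precision `p=12` are assumptions made for illustration only. Each register keeps only the largest rank seen so far, which is the "last update" behaviour the abstract refers to.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative standard HLL sketch (assumed parameters, not the paper's scheme)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the alpha formula below assumes m >= 128.
        if p < 7:
            raise ValueError("this sketch assumes p >= 7")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    @staticmethod
    def _hash64(item):
        # 64-bit hash of the item's string form (BLAKE2b is an arbitrary choice here).
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def add(self, item):
        h = self._hash64(item)
        width = 64 - self.p
        idx = h >> width                      # top p bits select the register
        rest = h & ((1 << width) - 1)         # remaining bits determine the rank
        # Rank = 1-based position of the leftmost 1-bit in `rest` (width + 1 if all zero).
        rank = width - rest.bit_length() + 1
        # Only the maximum rank per register is retained.
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting while registers are empty.
        if raw <= 2.5 * m and zeros > 0:
            return m * math.log(m / zeros)
        return raw
```

A typical use would be to call `add` once per arriving element (for example, a flow identifier) and call `estimate` whenever a count is needed. Repeated elements hash to the same register and rank, so they leave the state unchanged.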
Pages: 75-79