Understanding Data Access Patterns for dCache System

Authors
Bellavita, Julian [1]
Sim, Caitlin [1]
Wu, Kesheng [2]
Sim, Alex [2]
Yoo, Shinjae [3]
Ito, Hiro [3]
Garonne, Vincent [3]
Lancon, Eric
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Natl Lab, Berkeley, CA USA
[3] Brookhaven Natl Lab, Upton, NY USA
Source
26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS, CHEP 2023 | 2024, Vol. 295
DOI
10.1051/epjconf/202429501053
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The storage management system dCache acts as a disk cache for high-energy physics (HEP) data from the US ATLAS community. Since its disk capacity is considerably smaller than the total volume of ATLAS data, a heuristic is needed to determine which data should be kept on disk. An effective heuristic would be to keep the data files that are expected to be heavily accessed in the near future. Through a careful study of access statistics, we find that a few of the most popular datasets are accessed much more frequently than the others, although which datasets are popular changes over time. If we could predict the near-term popularity of datasets, we could pin the most popular ones in the disk cache to prevent their accidental removal and guarantee their availability. To predict a dataset's popularity, we present several methods for forecasting the number of times a dataset will be accessed in the next day. Test results show that these methods can reliably predict the next-day access counts of popular datasets. This observation is confirmed with dCache logs from two separate time ranges.
Pages: 9
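
The abstract describes forecasting each dataset's next-day access count and pinning the most popular datasets in the disk cache, but it does not say which forecasting methods the authors use. The following is only a minimal sketch of that general idea, assuming a simple moving average as a stand-in forecaster. The function names, the `window` and `k` parameters, and the sample counts are all hypothetical and are not taken from the paper or from dCache.

```python
def forecast_next_day(daily_counts, window=3):
    """Forecast the next-day access count as the mean of the last `window` days.

    A moving average is an assumed stand-in; the paper's methods are not
    specified in the abstract.
    """
    recent = daily_counts[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def datasets_to_pin(history, k=2, window=3):
    """Return the names of the k datasets with the highest forecast count.

    `history` maps a dataset name to its list of daily access counts,
    oldest first. Ties are broken alphabetically by name.
    """
    forecasts = {
        name: forecast_next_day(counts, window)
        for name, counts in history.items()
    }
    ranked = sorted(forecasts, key=lambda name: (-forecasts[name], name))
    return ranked[:k]


# Hypothetical daily access counts for four datasets (illustrative only).
history = {
    "dataset_a": [120, 150, 180],
    "dataset_b": [5, 3, 4],
    "dataset_c": [60, 90, 30],
    "dataset_d": [0, 0, 200],
}

pinned = datasets_to_pin(history, k=2)
print(pinned)
```

In this sketch a dataset with a sudden burst of accesses (`dataset_d`) can outrank one with steadier but lower traffic (`dataset_c`), which reflects the abstract's remark that the set of popular datasets changes over time. A shorter `window` reacts faster to such shifts at the cost of noisier forecasts.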