The Performance Analysis of Distributed Storage Systems Used in Scalable Web Systems

Authors
Oles, Dominik [1]
Nowak, Ziemowit [2]
Affiliations
[1] Tieto Czech Sro, 28 Rijna 3346-91, Ostrava 70200, Czech Republic
[2] Wroclaw Univ Sci & Technol, Fac Comp Sci & Management, Wybrzeze Wyspianskiego 27, PL-50370 Wroclaw, Poland
Keywords
Big Data; Hadoop; HBase; Kudu
DOI
10.1007/978-3-319-99981-4_27
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scalable web systems are directly tied to the distributed storage systems used to process large amounts of data (big data). One example is Hadoop, with its many extensions supporting data storage, such as SQL-on-Hadoop systems and the Parquet file format. Another class of systems for storing and processing big data is NoSQL databases, such as HBase, which are used in applications requiring fast random access. The Kudu system was created to combine the advantages of Hadoop and HBase and to enable both efficient analysis of data sets and fast random access. This research analyzes the performance of these systems. The experiment was conducted in the Amazon Web Services public cloud, where a cluster of nine virtual machines was configured. A fragment of the public "Wikipedia Page Traffic Statistics" dataset containing about one billion rows was used. The measurement results confirm that Kudu is a promising alternative to the commonly used technologies.
Pages: 287-298
Page count: 12