The Performance Analysis of Distributed Storage Systems Used in Scalable Web Systems

Cited by: 0
Authors
Oles, Dominik [1]
Nowak, Ziemowit [2]
Affiliations
[1] Tieto Czech Sro, 28 Rijna 3346-91, Ostrava 70200, Czech Republic
[2] Wroclaw Univ Sci & Technol, Fac Comp Sci & Management, Wybrzeze Wyspianskiego 27, PL-50370 Wroclaw, Poland
Keywords
Big Data; Hadoop; HBase; Kudu
DOI
10.1007/978-3-319-99981-4_27
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scalable web systems are closely tied to the distributed storage systems used to process large amounts of data (big data). One example is Hadoop, with its many extensions supporting data storage, such as SQL-on-Hadoop systems and the Parquet file format. Another class of systems for storing and processing big data is NoSQL databases such as HBase, which are used in applications requiring fast random access. The Kudu system was created to combine the advantages of Hadoop and HBase, enabling both efficient analysis of data sets and fast random access. This research analyzes the performance of these systems. The experiment was conducted in the Amazon Web Services public cloud, on a cluster of nine virtual machines. It used a fragment of the public "Wikipedia Page Traffic Statistics" dataset containing about a billion rows. The measurement results confirm that Kudu is a promising alternative to the commonly used technologies.
Pages: 287-298
Number of pages: 12