The Performance Analysis of Distributed Storage Systems Used in Scalable Web Systems

Cited by: 0
Authors
Oles, Dominik [1]
Nowak, Ziemowit [2]
Affiliations
[1] Tieto Czech Sro, 28 Rijna 3346-91, Ostrava 70200, Czech Republic
[2] Wroclaw Univ Sci & Technol, Fac Comp Sci & Management, Wybrzeze Wyspianskiego 27, PL-50370 Wroclaw, Poland
Keywords
Big Data; Hadoop; HBase; Kudu
DOI
10.1007/978-3-319-99981-4_27
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scalable web systems are directly related to the distributed storage systems used to process large amounts of data (big data). An example of such a system is Hadoop, with its many extensions supporting data storage, such as SQL-on-Hadoop systems and the "Parquet" file format. Another class of systems for storing and processing big data is NoSQL databases, such as HBase, which are used in applications requiring fast random access. The Kudu system was created to combine the advantages of Hadoop and HBase and to enable both effective data set analysis and fast random access. This research analyzes the performance of these systems. The experiment was conducted in the Amazon Web Services public cloud environment, where a cluster of nine virtual machines was configured. For the purposes of the research, a fragment of the "Wikipedia Page Traffic Statistics" public dataset containing about one billion rows was used. The results of the measurements confirm that the Kudu system is a promising alternative to the commonly used technologies.
Pages: 287-298
Number of pages: 12