Cluster Computing for Web-Scale Data Processing

Cited by: 0
Authors
Kimball, Aaron [1]
Michels-Slettvet, Sierra [1]
Bisciglia, Christophe
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
Education; Hadoop; MapReduce; Clusters; Distributed computing
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this paper we present the design of a modern course in cluster computing and large-scale data processing. The defining differences between this and previously published designs are its focus on processing very large data sets and its use of Hadoop, an open source Java-based implementation of MapReduce and the Google File System, as the platform for programming exercises. Hadoop proved to be a key element for successfully implementing structured lab activities and independent design projects. Through this course, offered at the University of Washington in 2007, we imparted new skills to our students, improving their ability to design systems capable of solving web-scale problems.
Pages: 116-120
Page count: 5