Building and Operating a Large-Scale Enterprise Data Analytics Platform

Cited by: 6
Authors
Bauer, Daniel [1]
Froese, Florian [1]
Garces-Erice, Luis [1]
Giblin, Chris [1]
Labbi, Abdel [1]
Nagy, Zoltan A. [1]
Pardon, Niels [1]
Rooney, Sean [1]
Urbanetz, Peter [1]
Vetsch, Pascal [1]
Wespi, Andreas [1]
Affiliation
[1] IBM Research Europe, Säumerstrasse 4, CH-8803 Rüschlikon, Switzerland
Keywords
Hybrid cloud; Datalake; Storage; Ingestion; SQL/Hadoop; Governance
DOI
10.1016/j.bdr.2020.100181
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Over the last three years we have been running a large-scale data processing platform for applying analytics to corporate data on an OpenStack private cloud instance. Our platform makes a wide variety of corporate data assets, such as sales, marketing, and customer information, as well as data from less conventional sources such as weather, news and social media, available for analytics purposes to hundreds of globally distributed teams across the company. We control every layer in the stack, from the processing engines down to the hardware. Here we report our experiences in building and operating such a system. We describe our technical choices and how they evolved as we observed the actual workloads created by users. © 2020 The Authors. Published by Elsevier Inc.
Pages: 20