Turbine: Facebook's Service Management Platform for Stream Processing

Cited by: 23
Authors
Mei, Yuan [1]
Cheng, Luwei [1]
Talwar, Vanish [1]
Levin, Michael Y. [1]
Jacques-Silva, Gabriela [1]
Simha, Nikhil [1]
Banerjee, Anirban [1]
Smith, Brian [1]
Williamson, Tim [1]
Yilmaz, Serhat [1]
Chen, Weitao [1]
Chen, Guoqiang Jerry [1]
Affiliations
[1] Facebook Inc, Menlo Pk, CA 94025 USA
Keywords
Stream Processing; Cluster Management
DOI
10.1109/ICDE48307.2020.00141
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The demand for stream processing at Facebook has grown as services increasingly rely on real-time signals to speed up decisions and actions. Emerging real-time applications require strict Service Level Objectives (SLOs) with low downtime and processing lag, even in the presence of failures and load variability. Addressing this challenge at Facebook scale led to the development of Turbine, a management platform designed to bridge the gap between the capabilities of existing general-purpose cluster management frameworks and Facebook's stream processing requirements. Specifically, Turbine features a fast and scalable task scheduler; an efficient predictive auto scaler; and an application update mechanism that provides fault tolerance, atomicity, consistency, isolation, and durability. Turbine has been in production for over three years and is one of the core technologies that enabled the rapid growth of stream processing at Facebook. It is currently deployed on clusters spanning tens of thousands of machines, managing several thousand streaming pipelines that process terabytes of data per second in real time. Our production experience has validated Turbine's effectiveness: its task scheduler evenly balances workload fluctuation across clusters; its auto scaler effectively and predictively handles unplanned load spikes; and its application update mechanism consistently and efficiently completes high-scale updates within minutes. This paper describes the Turbine architecture, discusses the design choices behind it, and shares several case studies demonstrating Turbine's capabilities in production.
Pages: 1591-1602
Page count: 12