[Demo] Low-latency Spark Queries on Updatable Data

Cited by: 4
Authors
Uta, Alexandru [1, 2]
Ghit, Bogdan [2 ]
Dave, Ankur [3 ]
Boncz, Peter [4 ]
Affiliations
[1] Vrije Univ Amsterdam, Amsterdam, Netherlands
[2] Databricks, Amsterdam, Netherlands
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] CWI Amsterdam, Amsterdam, Netherlands
DOI
10.1145/3299869.3320227
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
As data science is increasingly deployed in operational applications, it becomes important for data science frameworks to perform computations in interactive, sub-second time. Indexing and caching are two key techniques that make interactive query processing on large datasets possible. In this demo, we show the design, implementation, and performance of a new indexing abstraction in Apache Spark, called the Indexed DataFrame. This is a cached DataFrame that incorporates an index to support fast lookup and join operations, and supports updates with multi-version concurrency. We demonstrate the Indexed DataFrame on a social network dataset using microbenchmarks and real-world graph processing queries, on datasets that grow continuously.
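The idea in the abstract (a cached table plus an index, with updates that do not disturb earlier readers) can be illustrated with a small sketch. This is plain Python, not the authors' implementation and not a Spark API: the class `ToyIndexedTable`, its methods, and the choice to use the row count as a version number are all hypothetical simplifications of multi-version concurrency, assuming append-only updates.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional, Tuple

Row = Tuple[Any, ...]


@dataclass
class ToyIndexedTable:
    """Hypothetical stand-in for an indexed, append-only cached table."""

    key_col: int                                   # position of the indexed column
    rows: List[Row] = field(default_factory=list)  # cached rows, in arrival order
    index: Dict[Any, List[int]] = field(default_factory=dict)  # key -> row positions

    def append(self, new_rows: Iterable[Row]) -> int:
        """Add rows and return a version: the number of rows now visible."""
        for row in new_rows:
            self.index.setdefault(row[self.key_col], []).append(len(self.rows))
            self.rows.append(row)
        return len(self.rows)

    def lookup(self, key: Any, version: Optional[int] = None) -> List[Row]:
        """Return rows matching key, restricted to a snapshot if version is given."""
        limit = len(self.rows) if version is None else version
        return [self.rows[i] for i in self.index.get(key, []) if i < limit]

    def join(self, probe_rows: Iterable[Row], probe_key_col: int,
             version: Optional[int] = None) -> List[Row]:
        """Index join: probe the index once per incoming row, no full scan."""
        return [p + m for p in probe_rows
                for m in self.lookup(p[probe_key_col], version)]


# Usage: a tiny edge list (src, dst) that keeps growing.
edges = ToyIndexedTable(key_col=0)
v1 = edges.append([(1, 2), (1, 3), (2, 3)])
v2 = edges.append([(1, 4)])

old_view = edges.lookup(1, version=v1)   # snapshot taken before the second append
new_view = edges.lookup(1)               # latest version
joined = edges.join([(2, "x")], probe_key_col=0)
```

Readers holding `v1` keep seeing the same rows after later appends, which is the property the abstract attributes to multi-version updates; a real system would additionally partition the data and index across a cluster.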
Pages: 2009-2012
Page count: 4