[Demo] Low-latency Spark Queries on Updatable Data

Cited by: 4
Authors
Uta, Alexandru [1, 2]
Ghit, Bogdan [2]
Dave, Ankur [3]
Boncz, Peter [4]
Affiliations
[1] Vrije Univ Amsterdam, Amsterdam, Netherlands
[2] Databricks, Amsterdam, Netherlands
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] CWI Amsterdam, Amsterdam, Netherlands
DOI
10.1145/3299869.3320227
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As data science gets deployed more and more into operational applications, it becomes important for data science frameworks to be able to perform computations in interactive, sub-second time. Indexing and caching are two key techniques that can make interactive query processing on large datasets possible. In this demo, we show the design, implementation and performance of a new indexing abstraction in Apache Spark, called the Indexed DataFrame. This is a cached DataFrame that incorporates an index to support fast lookup and join operations, and supports updates with multi-version concurrency. We demonstrate the Indexed DataFrame on a social network dataset using microbenchmarks and real-world graph processing queries, in datasets that are continuously growing.
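The idea in the abstract (a cached table with an index for lookups and joins, plus versioned appends) can be illustrated with a small single-process sketch. This is not the paper's implementation and does not use any Spark API: the class `IndexedTable`, its methods, and the sample data are hypothetical names chosen for illustration, and "multi-version" is reduced here to an append-only row list in which a version number is simply the row count at the time of a snapshot.

```python
from collections import defaultdict


class IndexedTable:
    """Toy append-only table with a hash index and version-based snapshots.

    Hypothetical illustration only; not the Indexed DataFrame API.
    """

    def __init__(self):
        self._rows = []                   # append-only storage of (key, row)
        self._index = defaultdict(list)   # key -> positions in self._rows
        self.version = 0                  # number of rows appended so far

    def append(self, key, row):
        """Append a row and return the new version number."""
        self._rows.append((key, row))
        self._index[key].append(len(self._rows) - 1)
        self.version = len(self._rows)
        return self.version

    def lookup(self, key, as_of=None):
        """Return rows for `key` visible at version `as_of` (default: latest)."""
        limit = self.version if as_of is None else as_of
        return [self._rows[pos][1]
                for pos in self._index.get(key, [])
                if pos < limit]

    def join(self, probe_rows, as_of=None):
        """Index join: probe the index once per (key, value) pair."""
        return [(key, value, match)
                for key, value in probe_rows
                for match in self.lookup(key, as_of)]


# A growing "follows" edge list, keyed by source user (made-up data).
edges = IndexedTable()
edges.append("alice", "bob")
snapshot = edges.append("alice", "carol")   # remember this version
edges.append("alice", "dave")               # later update

latest = edges.lookup("alice")
older = edges.lookup("alice", as_of=snapshot)
```

Readers holding `snapshot` keep seeing the table as it was at that version while new rows are appended, which is the property the abstract refers to as multi-version concurrency; a real distributed implementation would also have to handle partitioning and concurrent writers, which this sketch ignores.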
Pages: 2009-2012
Page count: 4