Simba: Spatial In-Memory Big Data Analysis

Cited by: 10
Authors
Xie, Dong [1]
Li, Feifei [1]
Yao, Bin [2]
Li, Gefei [2]
Chen, Zhongpu [2]
Zhou, Liang [2]
Guo, Minyi [2]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
[2] Shanghai Jiao Tong Univ, Shanghai 200030, Peoples R China
Funding
National Science Foundation (USA)
Keywords
Simba; Spatial data analysis; Big data; Distributed system
DOI
10.1145/2996913.2996935
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present the Simba (Spatial In-Memory Big data Analytics) system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba natively extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and the DataFrame API. It enables the construction of indexes over RDDs inside the engine in order to work with big spatial data and complex spatial operations. Simba also comes with an effective query optimizer, which leverages its indexes and novel spatial-aware optimizations to achieve both low latency and high throughput in big spatial data analysis. This demonstration proposal describes key ideas in the design of Simba and presents a demonstration plan.
Pages: 4
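
The abstract mentions building indexes so that spatial queries avoid scanning all the data. The record does not describe how Simba's indexes work, so the following is only a minimal sketch of the general idea, assuming a uniform grid over 2-D points held in a single process. The class `GridIndex`, its `cell_size` parameter, and the `range_query` method are hypothetical names for illustration and are not part of Simba or Spark.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


class GridIndex:
    """Toy uniform-grid index over 2-D points (hypothetical, not Simba's API)."""

    def __init__(self, points: List[Point], cell_size: float) -> None:
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        self.cell_size = cell_size
        # Map each grid cell to the points that fall inside it.
        self.cells: Dict[Cell, List[Point]] = defaultdict(list)
        for p in points:
            self.cells[self._cell_of(p)].append(p)

    def _cell_of(self, p: Point) -> Cell:
        # Floor division also places negative coordinates in the right cell.
        return (int(p[0] // self.cell_size), int(p[1] // self.cell_size))

    def range_query(self, low: Point, high: Point) -> List[Point]:
        """Return points inside the axis-aligned box [low, high], inclusive."""
        cx_lo, cy_lo = self._cell_of(low)
        cx_hi, cy_hi = self._cell_of(high)
        result: List[Point] = []
        # Visit only the cells that overlap the query box, then filter exactly.
        for cx in range(cx_lo, cx_hi + 1):
            for cy in range(cy_lo, cy_hi + 1):
                for (x, y) in self.cells.get((cx, cy), ()):
                    if low[0] <= x <= high[0] and low[1] <= y <= high[1]:
                        result.append((x, y))
        return result


points = [(0.5, 0.5), (2.5, 2.5), (9.0, 9.0), (3.0, 1.0)]
index = GridIndex(points, cell_size=2.0)
hits = index.range_query((0.0, 0.0), (3.0, 3.0))
```

Here the query touches four grid cells and never inspects the point at (9.0, 9.0). A distributed system would apply the same kind of pruning across partitions as well as within them.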