Large Scale Analytics of Vector plus Raster Big Spatial Data

Cited by: 11
Authors
Eldawy, Ahmed [1]
Niu, Lyuye [1]
Haynes, David [2]
Su, Zhiba [1]
Affiliations
[1] Univ Calif Riverside, Comp Sci & Engn, Riverside, CA 92521 USA
[2] Univ Minnesota Twin Cities, Program Hlth Dispar, Minneapolis, MN USA
Funding
National Institutes of Health (NIH), USA;
Keywords
Big Spatial Data; Raster; Vector; Satellite; Clipping;
DOI
10.1145/3139958.3140042
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Significant increases in the volume of big spatial data have driven researchers and practitioners to build specialized systems to process and analyze this data. Existing efforts focus on either big raster data, e.g., remote sensing data or medical images, or big vector data, e.g., geotagged tweets or trajectories. However, when raster and vector data are mixed, one dataset must first be converted to the other representation through a vector-to-raster or raster-to-vector transformation, which is extremely inefficient for large datasets. In this paper, we advocate a third approach that mixes the raw representations of both vector and raster data in the query processor. As a case study, we apply this approach to the zonal statistics problem, which computes statistics over a raster layer for each polygon in a vector layer. We propose a novel method, called the Scanline method, which does not require a conversion between raster and vector. Experimental evaluation on real datasets as large as 840 billion pixels shows up to three orders of magnitude speedup over the baseline methods.
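To make the zonal statistics problem concrete, the following is a minimal Python sketch of a scanline-style computation for a single polygon. It is an illustration only and should not be read as the algorithm from the paper, whose details are not given in this record. The function name `scanline_zonal_stats` is hypothetical, and the sketch assumes that the polygon is a simple ring given in pixel coordinates, that pixel (row, col) has its center at (col + 0.5, row + 0.5), and that a pixel belongs to the polygon when its center lies inside it.

```python
import math


def scanline_zonal_stats(raster, polygon):
    """Illustrative sketch: statistics of pixels whose centers fall inside polygon.

    raster  -- list of rows, each a list of numeric pixel values
    polygon -- list of (x, y) vertices in pixel coordinates (assumed simple ring)
    """
    height = len(raster)
    width = len(raster[0]) if height else 0
    count, total = 0, 0.0
    lo, hi = math.inf, -math.inf
    n = len(polygon)

    for row in range(height):
        y = row + 0.5  # horizontal scanline through the pixel centers of this row
        crossings = []
        for i in range(n):
            (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
            # Half-open rule: an edge counts when y lies in [min(y1, y2), max(y1, y2)).
            # Horizontal edges never match, so the division below is safe.
            if (y1 <= y < y2) or (y2 <= y < y1):
                crossings.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
        crossings.sort()

        # Consecutive pairs of crossings bound the interior runs on this row.
        for x_start, x_end in zip(crossings[0::2], crossings[1::2]):
            # Columns whose center (col + 0.5) lies in [x_start, x_end).
            first = max(0, math.ceil(x_start - 0.5))
            last = min(width, math.ceil(x_end - 0.5))
            for col in range(first, last):
                value = raster[row][col]
                count += 1
                total += value
                lo = min(lo, value)
                hi = max(hi, value)

    if count == 0:
        return {"count": 0, "sum": 0.0, "min": None, "max": None, "mean": None}
    return {"count": count, "sum": total, "min": lo, "max": hi, "mean": total / count}


# Usage: a 4x4 raster and a square covering the four central pixel centers.
raster = [[r * 4 + c for c in range(4)] for r in range(4)]
square = [(1, 1), (3, 1), (3, 3), (1, 3)]
stats = scanline_zonal_stats(raster, square)
```

The sketch reads raster rows directly and intersects each row with the polygon edges, so neither dataset is converted into the other representation. Running it once per polygon in a vector layer would give per-zone statistics.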
Pages: 4