SubZero: A Fine-Grained Lineage System for Scientific Databases

Authors
Wu, Eugene [1]
Madden, Samuel [1]
Stonebraker, Michael [1]
Affiliations
[1] MIT, CSAIL, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data lineage is a key component of provenance that helps scientists track and query relationships between input and output data. While current systems readily support lineage relationships at the file or data array level, finer-grained support at an array-cell level is impractical due to the lack of support for user-defined operators and the high runtime and storage overhead to store such lineage. We interviewed scientists in several domains to identify a set of common semantics that can be leveraged to efficiently store fine-grained lineage. We use the insights to define lineage representations that efficiently capture common locality properties in the lineage data, and a set of APIs so operator developers can easily export lineage information from user-defined operators. Finally, we introduce two benchmarks derived from astronomy and genomics, and show that our techniques can reduce lineage query costs by up to 10x while incurring substantially less impact on workflow runtime and storage.
Pages: 865-876
Page count: 12