Protecting genomic data analytics in the cloud: state of the art and opportunities

Cited by: 0
Authors
Haixu Tang
Xiaoqian Jiang
Xiaofeng Wang
Shuang Wang
Heidi Sofia
Dov Fox
Kristin Lauter
Bradley Malin
Amalio Telenti
Li Xiong
Lucila Ohno-Machado
Affiliations
[1] Indiana University, School of Informatics and Computing
[2] University of California San Diego, Department of Biomedical Informatics
[3] National Human Genome Research Institute, School of Law
[4] University of San Diego, Department of Biomedical Informatics, School of Medicine
[5] Microsoft Research, Department of Mathematics and Computer Science
[6] Vanderbilt University
[7] The J. Craig Venter Institute
[8] Emory University
Keywords
Edit Distance; Data Owner; Public Cloud; Cryptographic Protocol; Homomorphic Encryption
DOI
Not available
Abstract
The outsourcing of genomic data into public cloud computing settings raises concerns over privacy and security. Significant advancements in secure computation methods have emerged over the past several years, but such techniques need to be rigorously evaluated for their ability to support the analysis of human genomic data in an efficient and cost-effective manner. With respect to public cloud environments, there are concerns about the inadvertent exposure of human genomic data to unauthorized users. In analyses involving multiple institutions, there is additional concern about data being used beyond the agreed research scope and being processed in untrusted computational environments, which may not satisfy institutional policies. To systematically investigate these issues, the NIH-funded National Center for Biomedical Computing iDASH (integrating Data for Analysis, ‘anonymization’ and SHaring) hosted the second Critical Assessment of Data Privacy and Protection competition to assess the capacity of cryptographic technologies for protecting computation over human genomes in the cloud and promoting cross-institutional collaboration. Data scientists were challenged to design and engineer practical algorithms for secure outsourcing of genome computation tasks in working software, whereby analyses are performed only on encrypted data. They were also challenged to develop approaches to enable secure collaboration on data from genomic studies generated by multiple organizations (e.g., medical centers) to jointly compute aggregate statistics without sharing individual-level records. The results of the competition indicated that secure computation techniques can enable comparative analysis of human genomes, but greater efficiency (in terms of compute time and memory utilization) is needed before they are sufficiently practical for real-world environments.
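The idea of several organizations jointly computing an aggregate statistic without sharing individual-level records can be illustrated with additive secret sharing. The sketch below is only an illustration under stated assumptions and is not the method used by any competition entry: the names `MODULUS`, `make_shares`, and `secure_sum` are hypothetical, all parties are simulated in one process, and the parties are assumed to follow the protocol honestly.

```python
import secrets

# Hypothetical public modulus; assumed to exceed any possible true total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_counts, modulus=MODULUS):
    """Simulate institutions summing private counts via additive sharing."""
    n = len(private_counts)
    # Row i holds the shares institution i hands out, one per institution.
    distributed = [make_shares(c, n, modulus) for c in private_counts]
    # Institution j adds the shares it received (column j) and publishes
    # only that partial sum, which on its own reveals no single input.
    partial_sums = [
        sum(row[j] for row in distributed) % modulus for j in range(n)
    ]
    # Combining the published partial sums yields the aggregate.
    return sum(partial_sums) % modulus


# Example: three medical centers hold private allele counts.
counts = [120, 87, 301]
total = secure_sum(counts)
```

Each individual share is uniformly random, so a single institution's count is not exposed by the messages it sends; only the final total is reconstructed. A real deployment would additionally need authenticated channels and protection against colluding or malicious parties.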
Related papers
50 in total
  • [1] Protecting genomic data analytics in the cloud: state of the art and opportunities
    Tang, Haixu
    Jiang, Xiaoqian
    Wang, Xiaofeng
    Wang, Shuang
    Sofia, Heidi
    Fox, Dov
    Lauter, Kristin
    Malin, Bradley
    Telenti, Amalio
    Xiong, Li
    Ohno-Machado, Lucila
    BMC MEDICAL GENOMICS, 2016, 9 : 1 - 9
  • [2] Data analytics in pharmaceutical supply chains: state of the art, opportunities, and challenges
    Nguyen, Angie
    Lamouri, Samir
    Pellerin, Robert
    Tamayo, Simon
    Lekens, Beranger
    INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2022, 60 (22) : 6888 - 6907
  • [3] Big data analytics in smart grids: state-of-the-art, challenges, opportunities, and future directions
    Bhattarai, Bishnu P.
    Paudyal, Sumit
    Luo, Yusheng
    Mohanpurkar, Manish
    Cheung, Kwok
    Tonkoski, Reinaldo
    Hovsapian, Rob
    Myers, Kurt S.
    Zhang, Rui
    Zhao, Power
    Manic, Milos
    Zhang, Song
    Zhang, Xiaping
    IET SMART GRID, 2019, 2 (02) : 141 - 154
  • [4] Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities
    Narasayya, Vivek
    Chaudhuri, Surajit
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022 : 2465 - 2473
  • [5] Big Data Based Security Analytics for Protecting Virtualized Infrastructures in Cloud Computing
    Thu Yein Win
    Tianfield, Huaglory
    Mair, Quentin
    IEEE TRANSACTIONS ON BIG DATA, 2018, 4 (01) : 11 - 25
  • [6] Cloud-based interactive analytics for terabytes of genomic variants data
    Pan, Cuiping
    McInnes, Gregory
    Deflaux, Nicole
    Snyder, Michael
    Bingham, Jonathan
    Datta, Somalee
    Tsao, Philip S.
    BIOINFORMATICS, 2017, 33 (23) : 3709 - 3715
  • [7] Protecting Privacy From Aerial photography: State of the Art, Opportunities, and Challenges
    Jiang, Bin
    Yang, Jiachen
    Song, Houbing
    IEEE INFOCOM 2020 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS), 2020 : 799 - 804
  • [8] Computational intelligence and state-of-the-art data analytics
    Yeager, William J.
    Morin, Jean-Henry
    Proceedings of the Annual Hawaii International Conference on System Sciences, 2021, 2020-January : 6922 - 6923
  • [9] The State-of-the-art of Social, Mobility, Analytics and Cloud Computing An Empirical Analysis
    Dewan, Bhushan
    Jena, Soumya Ranjan
    2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND APPLICATIONS (ICHPCA), 2014
  • [10] Unified Data Analytics: State-of-the-art and Open Problems
    Kaoudi, Zoi
    Quiane-Ruiz, Jorge-Arnulfo
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2022, 15 (12) : 3778 - 3781