Quantifying Genomic Privacy via Inference Attack with High-Order SNV Correlations

Cited by: 18
Authors
Samani, Sahel Shariati [1]
Huang, Zhicong [2]
Ayday, Erman [3]
Elliot, Mark [1]
Fellay, Jacques [2]
Hubaux, Jean-Pierre [2]
Kutalik, Zoltan [4]
Affiliations
[1] Univ Manchester, Manchester M13 9PL, Lancs, England
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
[3] Bilkent Univ, Bilkent, Turkey
[4] Univ Lausanne Hosp, Lausanne, Switzerland
Source
2015 IEEE SECURITY AND PRIVACY WORKSHOPS (SPW) | 2015
Keywords
LINKAGE DISEQUILIBRIUM; WIDE ASSOCIATION;
DOI
10.1109/SPW.2015.21
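Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract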
As genomic data becomes widely used, genomic data privacy has become a hot interdisciplinary research topic among geneticists, bioinformaticians, and security and privacy experts. Practical attacks on genomic data have been identified, and they break the privacy expectations of individuals who contribute their genomic data to medical research or simply share their data online. Frustrating as it is, the problem could become even worse. Existing genomic privacy breaches rely on low-order SNV (Single Nucleotide Variant) correlations. Our work shows that far more powerful attacks can be designed if high-order correlations are utilized. We corroborate this concern by deriving different SNV correlations from various genomic data models and applying them to an inference attack on individuals' genotype data with hidden SNVs. We also show that low-order models behave very differently from real genomic data and therefore should not be relied upon for privacy-preserving solutions.
Pages: 32-40
Page count: 9