Individual privacy levels in query-based anonymization

Times cited: 1
Authors
Schiegg, Sascha [1]
Strohmeier, Florian [1]
Gerl, Armin [2]
Kosch, Harald [1]
Affiliations
[1] Univ Passau, Passau, Germany
[2] HM Univ Appl Sci Munich, Munich, Germany
Keywords
Data privacy; Query processing; Data warehouses
DOI
10.1145/3664476.3670920
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Artificial intelligence systems such as Large Language Models (LLMs) derive their knowledge from large datasets, so systems like ChatGPT rely on shared data for training. For companies, releasing data to the public domain requires anonymization as soon as an individual is identifiable. Several privacy models, such as k-anonymity, guarantee that a certain level of distortion is applied to a dataset to mitigate re-identification, but the required level is generally defined by the data processor. We propose combining individual privacy levels defined by the data subjects themselves with a privacy language, such as the Layered Privacy Language (LPL) [10], to obtain a more fine-grained understanding of the privacy level that is effectively required. Queries that target subsets of the dataset to be released can only benefit from lower privacy requirements set by data subjects: a response subset may not contain any users with high privacy requirements, which can in turn lead to more utility. By analyzing the results of different queries to a privacy-aware, data-transforming database system, we demonstrate the characteristics required for this assumption to be truly effective. For a more realistic evaluation, we also consider changes in the underlying data sources.
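The core idea of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Record` type, the `required_k` field, the `effective_k` helper, and the sample data are all hypothetical, and the sketch assumes that the strictest individual k among the data subjects in a query result is the level that governs that result. The paper may define the effective level differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Record:
    subject_id: str
    age: int
    required_k: int  # hypothetical: k-anonymity level chosen by the data subject


def effective_k(
    records: Iterable[Record],
    predicate: Callable[[Record], bool],
    default_k: int = 1,
) -> int:
    """Return the strictest individual k among records matched by the query.

    Assumption for illustration: the highest individual requirement in the
    response subset determines the level needed for the whole subset.
    """
    matched = [r.required_k for r in records if predicate(r)]
    return max(matched, default=default_k)


# Hypothetical dataset: one subject ("c") asks for a much higher level.
DATASET = [
    Record("a", 23, 2),
    Record("b", 31, 2),
    Record("c", 47, 10),
    Record("d", 52, 3),
]

# Releasing everything is governed by the strictest subject.
k_all = effective_k(DATASET, lambda r: True)

# A query whose response excludes subject "c" needs less distortion.
k_under_40 = effective_k(DATASET, lambda r: r.age < 40)

print(k_all, k_under_40)
```

Under these assumptions, the subset query never requires a higher level than the full release, which matches the abstract's claim that queries can only benefit from lower individual requirements.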
Pages: 8