A Multimodal Approach to Synthetic Personal Data Generation with Mixed Modelling: Bayesian Networks, GAN's and Classification Models

Cited by: 0
Authors
Deeva, Irina [1]
Mossyayev, Andrey [1]
Kalyuzhnaya, Anna V. [1]
Affiliations
[1] ITMO Univ, St Petersburg, Russia
Keywords
Synthetic personal data; Bayesian networks; Generative adversarial networks; Multimodal approach; Classification models
DOI
10.1007/978-3-030-94822-1_55
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Personal data is multimodal: it spans tabular data, images, and text. Generating synthetic personal data therefore requires a large number of interconnected datasets, but collecting tabular data, images, or texts for the same people is often very difficult. This need for interconnected datasets can be avoided by training a separate model for each data type and combining the models into a single pipeline. This paper presents a multimodal approach to generating synthetic personal data for a social network user: socio-demographic profile information (tabular data), an avatar image, and content images that correlate with the user's interests. The approach combines Bayesian networks, generative adversarial networks, and a discriminative model. Because the models are trained independently, the approach does not require interconnected datasets (profile information plus photos), and it could also be used, for example, to anonymize medical data. A quantitative assessment shows that the obtained synthetic profiles are quite plausible.
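The pipeline idea in the abstract (independently trained models chained together) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-node network `age_group -> interest`, the probability values, the stand-in functions `generate_image` and `classify_image`, and the use of the classifier as a rejection filter. The abstract does not say how the discriminative model is actually applied.

```python
import random

# Hypothetical probability tables for a tiny Bayesian network
# age_group -> interest. The numbers are illustrative only.
P_AGE = {"18-25": 0.5, "26-40": 0.3, "41+": 0.2}
P_INTEREST_GIVEN_AGE = {
    "18-25": {"music": 0.6, "sport": 0.3, "travel": 0.1},
    "26-40": {"music": 0.3, "sport": 0.4, "travel": 0.3},
    "41+": {"music": 0.2, "sport": 0.2, "travel": 0.6},
}
TOPICS = ["music", "sport", "travel"]


def sample_categorical(dist, rng):
    """Draw one label from a {label: probability} table."""
    labels = list(dist)
    weights = [dist[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def sample_profile(rng):
    """Ancestral sampling: parent node first, then the child given the parent."""
    age = sample_categorical(P_AGE, rng)
    interest = sample_categorical(P_INTEREST_GIVEN_AGE[age], rng)
    return {"age_group": age, "interest": interest}


def generate_image(rng):
    """Stand-in for a trained GAN generator (hypothetical).

    Returns a fake 'image' with a hidden topic so the example stays
    self-contained and needs no deep learning library.
    """
    return {"pixels": [rng.random() for _ in range(4)], "topic": rng.choice(TOPICS)}


def classify_image(image):
    """Stand-in for a trained image classifier (hypothetical)."""
    return image["topic"]


def generate_synthetic_user(rng, n_content=3, max_tries=1000):
    """Chain the three independently built components into one pipeline."""
    profile = sample_profile(rng)
    content = []
    tries = 0
    # Assumed linking step: keep only generated images whose predicted
    # topic matches the interest sampled for the tabular profile.
    while len(content) < n_content and tries < max_tries:
        image = generate_image(rng)
        tries += 1
        if classify_image(image) == profile["interest"]:
            content.append(image)
    return {"profile": profile, "content_images": content}


if __name__ == "__main__":
    user = generate_synthetic_user(random.Random(0))
    print(user["profile"], len(user["content_images"]))
```

In this sketch no component sees paired profile and image data during construction. The only link between modalities is the shared label (`interest`), which is the property the abstract attributes to independent training.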
Pages: 847-859
Page count: 13