Dynamic Data Protection System for Open Big Data Environment

Authors
Tu Y.-F. [1, 2]
Niu J.-H. [2]
Wang D.-Z. [1, 2]
Gao H. [2]
Xu J. [2]
Hong K. [2]
Yang F. [2]
Affiliations
[1] State Key Laboratory of Mobile Network and Mobile Multimedia Technology, ZTE Corporation, Shenzhen
[2] ZTE Corporation, Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2023, Vol. 34, No. 3
Keywords
big data; data masking; dynamic data masking; query dependency; SQL rewriting
DOI
10.13328/j.cnki.jos.006783
Abstract
Big data has become a basic national strategic resource, and the opening and sharing of data is at the core of China’s big data strategy. Cloud-native technology and the lake-house architecture are reshaping big data infrastructure and promoting data sharing and the dissemination of data value. The development of the big data industry and its technology requires stronger data security and data sharing capabilities. However, data security in an open environment has become a bottleneck that restricts the development and use of big data technology. Data security and privacy protection issues have become increasingly prominent in both the open-source big data ecosystem and commercial big data systems. Dynamic data protection systems in an open big data environment face challenges in data availability, processing efficiency, and system scalability, among others. This study proposes BDMasker, a dynamic data protection system for the open big data environment. Using precise query analysis and query rewriting based on a query dependency model, it accurately perceives the original business request without changing it, so the whole dynamic data masking process has zero impact on the business. Furthermore, its unified, multi-engine security policy framework enables vertical expansion of dynamic data protection capabilities and horizontal expansion across multiple computing engines. The system can also use the distributed computing capability of the big data execution engine to improve data protection processing performance. The experimental results show that the precise SQL analysis and rewriting technology proposed in BDMasker is effective, that the system has good scalability and performance, and that overall performance fluctuates within 3% in the TPC-DS and YCSB benchmark tests. © 2023 Chinese Academy of Sciences. All rights reserved.
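The general idea of dynamic masking through SQL rewriting, as described in the abstract, can be illustrated with a minimal sketch. This is not BDMasker's algorithm: the policy format, the `rewrite` function, the `customers` table, and the regex-based matching are all assumptions made for illustration, and the sketch does not handle table aliases, nested queries, or the query dependency model the paper relies on. It only shows how a protected table reference can be swapped for a masked subquery while the rest of the business query stays as written.

```python
import re
import sqlite3

# Hypothetical policy: table -> {column: masking SQL expression}.
MASK_POLICY = {
    "customers": {
        "phone": "substr(phone, 1, 3) || '****'",
    }
}


def rewrite(sql, policy, schema):
    """Replace each protected table reference with a masked subquery.

    Naive illustration: matches "FROM <table>" / "JOIN <table>" by regex
    and assumes the query does not give the table its own alias.
    """
    for table, rules in policy.items():
        cols = ", ".join(
            f"{rules[c]} AS {c}" if c in rules else c for c in schema[table]
        )
        masked = f"(SELECT {cols} FROM {table}) AS {table}"
        pattern = rf"\b(FROM|JOIN)\s+{re.escape(table)}\b"
        sql = re.sub(
            pattern,
            lambda m: f"{m.group(1)} {masked}",
            sql,
            flags=re.IGNORECASE,
        )
    return sql


def demo():
    """Run a rewritten query against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, phone TEXT)")
    conn.execute("INSERT INTO customers VALUES (1, 'Ann', '13812345678')")
    schema = {"customers": ["id", "name", "phone"]}
    original = "SELECT name, phone FROM customers WHERE id = 1"
    rewritten = rewrite(original, MASK_POLICY, schema)
    return conn.execute(rewritten).fetchall()
```

Because the masking expression lives inside the substituted subquery, the outer SELECT list and WHERE clause are untouched, and the masking itself is evaluated by the execution engine rather than by a separate post-processing step.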
Pages: 1213-1235
Number of pages: 22