Safely Managing Data Variety in Big Data Software Development

Cited by: 0
Authors
Cerqueus, Thomas [1]
de Almeida, Eduardo Cunha [2]
Scherzinger, Stefanie [3]
Affiliations
[1] Univ Lyon, CNRS, INSA Lyon, LIRIS, UMR5205, Lyon, France
[2] Univ Fed Parana, BR-80060000 Curitiba, Parana, Brazil
[3] OTH Regensburg, Regensburg, Germany
Keywords
SCHEMA EVOLUTION; MODEL;
DOI
10.1109/BIGDSE.2015.9
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
We consider the task of building Big Data software systems, offered as software-as-a-service. These applications are commonly backed by NoSQL data stores that address the proverbial Vs of Big Data processing: NoSQL data stores can handle large volumes of data, and many systems do not enforce a global schema, to account for structural variety in data. Thus, software engineers can design the data model on the go, a flexibility that is particularly crucial in agile software development. However, NoSQL data stores commonly do not yet account for veracity when the structure of persisted data changes, even though such changes are an inevitable consequence of agile software development. In most NoSQL-based application stacks, schema evolution is handled entirely within the application code, usually involving object mapper libraries. Yet simple code refactorings, such as renaming a class attribute at the source code level, can cause data loss or runtime errors once the application has been deployed to production. We address this pain point by contributing type checking rules that we have implemented within an IDE plugin. Our plugin ControVol statically type checks the object mapper class declarations against the code release history. ControVol is thus capable of detecting common yet risky cases of mismatched data and schema, and can even suggest automatic fixes.
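The renaming hazard described in the abstract can be illustrated with a minimal sketch. It is not ControVol's implementation and does not reproduce its type checking rules. The classes `PlayerV1` and `PlayerV2`, the toy mapper functions `load` and `save`, and the helper `check_release` are all hypothetical names introduced here, and the mapper is assumed to silently ignore stored properties that the current class no longer declares.

```python
from dataclasses import dataclass, fields, asdict
from typing import Optional


# Release 1 of a hypothetical object mapper class.
@dataclass
class PlayerV1:
    name: Optional[str] = None
    level: Optional[int] = None


# Release 2: "level" was renamed to "rank" in the source code only;
# entities already persisted by release 1 still carry "level".
@dataclass
class PlayerV2:
    name: Optional[str] = None
    rank: Optional[int] = None


def load(cls, entity):
    """Toy mapper (assumption): copy known properties, ignore the rest."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in entity.items() if k in known})


def save(obj):
    """Toy mapper (assumption): persist exactly the declared attributes."""
    return asdict(obj)


def check_release(old_cls, new_cls):
    """Compare two releases of a mapper class and report attributes that
    legacy entities may carry but the new class drops, and attributes the
    new class expects but legacy entities lack."""
    old = {f.name for f in fields(old_cls)}
    new = {f.name for f in fields(new_cls)}
    return {"dropped": sorted(old - new), "added": sorted(new - old)}


# An entity persisted by release 1 in a schemaless store.
legacy = save(PlayerV1(name="ada", level=7))

# Release 2 loads the entity and writes it back.
reloaded = load(PlayerV2, legacy)
rewritten = save(reloaded)

# Comparing the declarations across releases flags the risky change
# before any data is touched.
report = check_release(PlayerV1, PlayerV2)
```

In this sketch, the value stored under `level` never reaches `rank`, and once the entity is written back the `level` property is gone. Comparing the class declarations of successive releases, as `check_release` does in a very simplified way, surfaces the mismatch before deployment.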
Pages: 4-10
Page count: 7