Safely Managing Data Variety in Big Data Software Development

Cited by: 0
Authors
Cerqueus, Thomas [1]
de Almeida, Eduardo Cunha [2]
Scherzinger, Stefanie [3]
Affiliations
[1] Univ Lyon, CNRS, INSA Lyon, LIRIS, UMR5205, Lyon, France
[2] Univ Fed Parana, BR-80060000 Curitiba, Parana, Brazil
[3] OTH Regensburg, Regensburg, Germany
Keywords
SCHEMA EVOLUTION; MODEL;
DOI
10.1109/BIGDSE.2015.9
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
We consider the task of building Big Data software systems, offered as software-as-a-service. These applications are commonly backed by NoSQL data stores that address the proverbial Vs of Big Data processing: NoSQL data stores can handle large volumes of data and many systems do not enforce a global schema, to account for structural variety in data. Thus, software engineers can design the data model on the go, a flexibility that is particularly crucial in agile software development. However, NoSQL data stores commonly do not yet account for the veracity of changes when it comes to changes in the structure of persisted data. Yet this is an inevitable consequence of agile software development. In most NoSQL-based application stacks, schema evolution is completely handled within the application code, usually involving object mapper libraries. Yet simple code refactorings, such as renaming a class attribute at the source code level, can cause data loss or runtime errors once the application has been deployed to production. We address this pain point by contributing type checking rules that we have implemented within an IDE plugin. Our plugin ControVol statically type checks the object mapper class declarations against the code release history. ControVol is thus capable of detecting common yet risky cases of mismatched data and schema, and can even suggest automatic fixes.
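The renaming hazard described in the abstract can be illustrated with a minimal sketch. Everything here is hypothetical: the `PlayerV1`/`PlayerV2` classes, the toy `load` mapper, and the `check_evolution` helper are assumptions made for illustration, and the check is a much simpler stand-in for, not a reproduction of, ControVol's type checking rules.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PlayerV1:
    # Release 1 of a hypothetical mapped class.
    name: Optional[str] = None
    score: int = 0


@dataclass
class PlayerV2:
    # Release 2: `name` was renamed to `full_name` in the source code only;
    # entities already persisted by release 1 still carry the key `name`.
    full_name: Optional[str] = None
    score: int = 0


def load(cls, entity: dict):
    """Toy object mapper (assumption): copy matching keys, silently ignore the rest."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in entity.items() if k in known})


def check_evolution(old_cls, new_cls) -> list:
    """Compare two releases of a class declaration and report risky changes."""
    old = {f.name: f.type for f in fields(old_cls)}
    new = {f.name: f.type for f in fields(new_cls)}
    warnings = []
    # An attribute present in the old release but not the new one means
    # persisted values under that key are no longer mapped.
    for attr in sorted(old.keys() - new.keys()):
        warnings.append(f"attribute '{attr}' removed or renamed: persisted values would be dropped on load")
    # An attribute kept under the same name but with a different declared type
    # may fail or be misread when old entities are loaded.
    for attr in sorted(old.keys() & new.keys()):
        if old[attr] != new[attr]:
            warnings.append(f"attribute '{attr}' changed type: old entities may not load")
    return warnings


stored = {"name": "Ada", "score": 7}  # entity persisted by release 1
player = load(PlayerV2, stored)       # the stored name is silently lost
warnings = check_evolution(PlayerV1, PlayerV2)
```

Loading the release 1 entity with the release 2 class leaves `full_name` unset without raising any error, which is the kind of silent data loss the abstract warns about; comparing the two declarations before deployment surfaces the problem instead.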
Pages: 4-10
Page count: 7