Safely Managing Data Variety in Big Data Software Development

Cited by: 0
Authors
Cerqueus, Thomas [1]
de Almeida, Eduardo Cunha [2]
Scherzinger, Stefanie [3]
Affiliations
[1] Univ Lyon, CNRS, INSA Lyon, LIRIS, UMR5205, Lyon, France
[2] Univ Fed Parana, BR-80060000 Curitiba, Parana, Brazil
[3] OTH Regensburg, Regensburg, Germany
Keywords
SCHEMA EVOLUTION; MODEL;
DOI
10.1109/BIGDSE.2015.9
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
We consider the task of building Big Data software systems, offered as software-as-a-service. These applications are commonly backed by NoSQL data stores that address the proverbial Vs of Big Data processing: NoSQL data stores can handle large volumes of data and many systems do not enforce a global schema, to account for structural variety in data. Thus, software engineers can design the data model on the go, a flexibility that is particularly crucial in agile software development. However, NoSQL data stores commonly do not yet account for veracity when the structure of persisted data changes. Yet such changes are an inevitable consequence of agile software development. In most NoSQL-based application stacks, schema evolution is completely handled within the application code, usually involving object mapper libraries. Yet simple code refactorings, such as renaming a class attribute at the source code level, can cause data loss or runtime errors once the application has been deployed to production. We address this pain point by contributing type checking rules that we have implemented within an IDE plugin. Our plugin ControVol statically type checks the object mapper class declarations against the code release history. ControVol is thus capable of detecting common yet risky cases of mismatched data and schema, and can even suggest automatic fixes.
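The renaming hazard mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Player` classes, the toy `load` mapper, and the `attribute_mismatch` check are hypothetical and do not reproduce ControVol's type checking rules or the API of any real object mapper. The sketch only shows why a renamed attribute silently loses data in a schema-less store, and how comparing class declarations across releases can flag it.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Set, Type


@dataclass
class PlayerV1:
    """Entity class as declared in an earlier release (hypothetical)."""
    name: str = ""
    level: int = 0


@dataclass
class PlayerV2:
    """Same entity after a refactoring renamed `level` to `rank`."""
    name: str = ""
    rank: int = 0


def load(cls: Type, stored: Dict[str, Any]):
    """Toy mapper: copy only the stored properties that the class declares."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in stored.items() if k in known})


def attribute_mismatch(old: Type, new: Type) -> Dict[str, Set[str]]:
    """Compare two releases of a class declaration.

    `dropped` attributes may still exist in stored entities but are no
    longer read; `added` attributes fall back to defaults for legacy data.
    """
    old_names = {f.name for f in fields(old)}
    new_names = {f.name for f in fields(new)}
    return {"dropped": old_names - new_names, "added": new_names - old_names}


# An entity persisted by the earlier release in a schema-less store.
stored = {"name": "Alice", "level": 7}

# Loading with the refactored class ignores `level`; `rank` gets its default.
legacy = load(PlayerV2, stored)

# A static comparison of the two declarations surfaces the mismatch.
report = attribute_mismatch(PlayerV1, PlayerV2)
```

In this sketch the stored value 7 never reaches the refactored class, while the comparison reports `level` as dropped and `rank` as added, which is the kind of mismatch a release-history check could surface before deployment.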
Pages: 4-10
Page count: 7