Big Data: from collection to visualization

Authors
Mohammed Ghesmoune
Hanene Azzag
Salima Benbernou
Mustapha Lebbah
Tarn Duong
Mourad Ouziri
Affiliations
[1] University of Paris 13, LIPN
[2] Sorbonne Paris City, UMR 7030
[3] University of Paris Descartes, CNRS
[4] Sorbonne Paris City, LIPADE
Source
Machine Learning | 2017 / Volume 106
Keywords
Data fusion; RDF; Semantic; Entity resolution; Big data; Map-Reduce; Spark; Data stream clustering; Micro-Batch streaming; GNG; Topological structure; Visualization
Abstract
Organisations increasingly rely on Big Data to discover correlations and patterns that would previously have remained hidden, and to use this new information to improve the quality of their business activities. In this paper we present a 'story' of Big Data, from the initial data collection to the final visualization, passing through data fusion, analysis and clustering. To this end, we present a complete workflow covering (a) how to represent the heterogeneous collected data using the high performance RDF language, how to fuse the Big Data in RDF by resolving the issue of entity disambiguation, and how to query those data to provide more relevant and complete knowledge; and (b) as the data are received as data streams, batchStream, a micro-batching version of the growing neural gas approach, which clusters data streams with a single pass over the data. The batchStream algorithm allows us to discover clusters of arbitrary shape without any assumption on the number of clusters. This Big Data workflow is implemented on the Spark platform, and we demonstrate it on synthetic and real data.
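The single-pass, micro-batch clustering idea described in the abstract can be illustrated with a small sketch. This is not the paper's batchStream algorithm and it does not use Spark: it is a simplified, pure-Python growing neural gas under stated assumptions (no error decay, no removal of isolated nodes, no fading of old data). The class `TinyGNG`, the helpers `micro_batches` and `synthetic_stream`, and all parameter values are hypothetical names and choices made only for this illustration.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyGNG:
    """Simplified growing neural gas (illustrative, not the paper's batchStream)."""

    def __init__(self, eps_winner=0.1, eps_neighbor=0.01, max_age=20,
                 insert_every=50, max_nodes=30):
        self.eps_winner, self.eps_neighbor = eps_winner, eps_neighbor
        self.max_age, self.insert_every, self.max_nodes = max_age, insert_every, max_nodes
        self.nodes = []   # prototype vectors
        self.error = []   # accumulated squared error per prototype
        self.edges = {}   # frozenset({i, j}) -> age of the edge
        self.seen = 0     # number of points processed so far

    def _move(self, i, x, eps):
        # Shift prototype i a fraction eps of the way towards x.
        self.nodes[i] = [w + eps * (xi - w) for w, xi in zip(self.nodes[i], x)]

    def _update(self, x):
        self.seen += 1
        if len(self.nodes) < 2:            # the first two points seed the graph
            self.nodes.append(list(x))
            self.error.append(0.0)
            return
        ranked = sorted(range(len(self.nodes)), key=lambda i: dist2(self.nodes[i], x))
        s1, s2 = ranked[0], ranked[1]      # nearest and second-nearest prototypes
        self.error[s1] += dist2(self.nodes[s1], x)
        self._move(s1, x, self.eps_winner)
        for edge in list(self.edges):      # age the winner's edges, drag its neighbours
            if s1 in edge:
                self.edges[edge] += 1
                (n,) = edge - {s1}
                self._move(n, x, self.eps_neighbor)
        self.edges[frozenset((s1, s2))] = 0
        self.edges = {e: a for e, a in self.edges.items() if a <= self.max_age}
        if self.seen % self.insert_every == 0 and len(self.nodes) < self.max_nodes:
            self._insert()

    def _insert(self):
        # Add a prototype halfway between the worst node and its worst neighbour.
        q = max(range(len(self.nodes)), key=lambda i: self.error[i])
        nbrs = [next(iter(e - {q})) for e in self.edges if q in e]
        if not nbrs:
            return
        f = max(nbrs, key=lambda i: self.error[i])
        r = len(self.nodes)
        self.nodes.append([(a + b) / 2 for a, b in zip(self.nodes[q], self.nodes[f])])
        self.error[q] *= 0.5
        self.error[f] *= 0.5
        self.error.append(self.error[q])
        del self.edges[frozenset((q, f))]
        self.edges[frozenset((q, r))] = 0
        self.edges[frozenset((f, r))] = 0

    def fit_batch(self, batch):
        """Single pass over one micro-batch; points are not stored afterwards."""
        for x in batch:
            self._update(x)


def micro_batches(stream, size):
    """Group an iterable of points into lists of at most `size` points."""
    batch = []
    for x in stream:
        batch.append(x)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def synthetic_stream(n, rng):
    """Hypothetical data: points drawn alternately around (0, 0) and (5, 5)."""
    for i in range(n):
        c = 0.0 if i % 2 == 0 else 5.0
        yield (rng.gauss(c, 0.5), rng.gauss(c, 0.5))


model = TinyGNG()
for batch in micro_batches(synthetic_stream(400, random.Random(0)), size=40):
    model.fit_batch(batch)
print(len(model.nodes), "prototypes,", len(model.edges), "edges")
```

Each point is visited once and then discarded, so memory depends on the number of prototypes rather than on the length of the stream. The number of clusters is not fixed in advance: prototypes are added where the accumulated error is highest, and the edges between prototypes give the topological structure that could later be visualized.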
Pages: 837-862
Page count: 25