Enhanced Data Mining and Visualization of Sensory-Graph-Modeled Datasets through Summarization

Cited by: 3
Authors
Hashmi, Syed Jalaluddin [1]
Alabdullah, Bayan [2]
Al Mudawi, Naif [3]
Algarni, Asaad [4]
Jalal, Ahmad [5]
Liu, Hui [6]
Affiliations
[1] Natl Univ Comp & Emerging Sci, Sch Comp, Islamabad 44000, Pakistan
[2] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Syst, POB 84428, Riyadh 11671, Saudi Arabia
[3] Najran Univ, Coll Comp Sci & Informat Syst, Dept Comp Sci, Najran 55461, Saudi Arabia
[4] Northern Border Univ, Fac Comp & Informat Technol, Dept Comp Sci, Rafha 91911, Saudi Arabia
[5] Air Univ, Fac Comp & AI, E9, Islamabad 44000, Pakistan
[6] Univ Bremen, Cognit Syst Lab, D-28359 Bremen, Germany
Keywords
sensors datasets; Bio-Mouse-Gene; data visualization; big data; data mining; graph summarization; weighted LSH; correction sets; storage
DOI
10.3390/s24144554
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
The acquisition, processing, mining, and visualization of sensory data for knowledge discovery and decision support has recently become a popular area of research. Its importance stems from its continuing role in improving healthcare and related disciplines. As a result, a huge amount of data has been collected and analyzed. These data are made available to the research community in various shapes and formats, and their representation and study as graphs or networks is itself an active research area. However, the large size of such graph datasets poses challenges for data mining and visualization. For example, knowledge discovery from the Bio-Mouse-Gene dataset, which has over 43 thousand nodes and 14.5 million edges, is a non-trivial task. Graph summarization is a useful alternative here, because it enables efficient analysis of such large and complex data. During summarization, all nodes that have similar structural properties are merged together. In doing so, traditional methods often overlook the importance of personalizing the summary, which would help highlight certain targeted nodes. Personalized or context-specific scenarios require a more tailored approach to accurately capture distinct patterns and trends. Personalized graph summarization therefore aims to obtain a concise depiction of the graph that emphasizes connections closer to a given set of target nodes. In this paper, we present a faster algorithm for the personalized graph summarization (PGS) problem, named IPGS, designed to facilitate enhanced and effective data mining and visualization of datasets from various domains, including biosensors. Our objective is to obtain a compression ratio similar to that of the state-of-the-art PGS algorithm, but in less time. To achieve this, we improve the execution time of the current state-of-the-art approach by using weighted locality-sensitive hashing. Experiments on eight large publicly available datasets demonstrate the effectiveness and scalability of IPGS while providing a compression ratio similar to that of the state-of-the-art approach. In this way, our research contributes to the study and analysis of sensory datasets from the perspective of graph summarization. We also present a detailed study on the Bio-Mouse-Gene dataset, conducted to investigate the effectiveness of graph summarization in the domain of biosensors.
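The abstract's idea of merging nodes with similar structural properties, found through hashing, can be illustrated with a minimal sketch. This is not the IPGS algorithm: it uses plain (unweighted) MinHash over neighbor sets in place of the paper's weighted LSH, it ignores target nodes and correction sets, and the names `minhash_signature`, `candidate_groups`, and the toy `adjacency` graph are hypothetical. Nodes whose signatures match land in the same bucket and become candidates for merging into one supernode.

```python
import hashlib
from collections import defaultdict


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(neighbors, num_hashes=4):
    """Plain MinHash signature of a non-empty neighbor set (assumption: unweighted)."""
    return tuple(
        min(_hash(seed, str(n)) for n in neighbors) for seed in range(num_hashes)
    )


def candidate_groups(adjacency, num_hashes=4):
    """Bucket nodes by signature; each bucket with 2+ nodes is a merge candidate."""
    buckets = defaultdict(list)
    for node, neighbors in adjacency.items():
        if neighbors:  # isolated nodes have no signature
            buckets[minhash_signature(neighbors, num_hashes)].append(node)
    return [sorted(group) for group in buckets.values() if len(group) > 1]


# Toy undirected graph: "a" and "b" share all neighbors, as do "y" and "z".
adjacency = {
    "a": {"x", "y", "z"},
    "b": {"x", "y", "z"},
    "c": {"x", "w"},
    "x": {"a", "b", "c"},
    "y": {"a", "b"},
    "z": {"a", "b"},
    "w": {"c"},
}
groups = candidate_groups(adjacency)
print(groups)
```

Nodes with identical neighbor sets always share a bucket. Nodes with merely overlapping neighbor sets share one only with some probability, which is why a full summarizer would still verify each candidate merge before applying it.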
Pages: 24