Visualizing High Dimensional Datasets Using Parallel Coordinates: Application to Gene Prioritization

Cited by: 0
Authors
Boogaerts, Thomas [1]
Tranchevent, Leon-Charles [1]
Pavlopoulos, Georgios A. [1]
Aerts, Jan [1]
Vandewalle, Joos [1]
Affiliations
[1] Katholieke Univ Leuven, ESAT SCD SISTA IBBT, KU Leuven Future Hlth Dept, B-3001 Louvain, Belgium
Keywords
data visualization; parallel coordinates; genetic algorithm; gene prioritization;
DOI
Not available
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
In this paper, we introduce a visualization tool for interactive and efficient exploration of high-dimensional data using parallel coordinates. An algorithm is developed to find an optimal permutation of the dimensions, which allows the data miner to immediately see the most important features or irregularities in the dataset. This is implemented as a genetic algorithm based on the travelling salesman problem, using maximal correlation as the fitness criterion. Other features of the tool include selection operators for grouping the data (such as selection by intersection or by angle), orthogonal and density plots that complement the parallel coordinates plot, manual arrangement of the order of the dimensions, the option to show all plots needed to see every pairwise relation between dimensions, and the display of a chosen number of standard deviations for each dimension separately. The tool is applied to multiple gene prioritization cases in search of genes that are relevant to certain genetic disorders. The datasets used were obtained with the MerKator and Endeavour tools and include breast cancer, cataract, Charcot-Marie-Tooth and cardiomyopathy datasets, as well as a dataset relating 29 diseases to 22206 genes. Our tool, manual and data can be downloaded from http://www.toomas.be/parcoord/.
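The axis-ordering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact fitness definition or the genetic operators, so the code below assumes that fitness is the sum of absolute Pearson correlations between neighbouring axes (an open-path travelling-salesman formulation), and it uses order crossover plus swap mutation as stand-in operators. All function names and parameters (`ordering_fitness`, `order_crossover`, `swap_mutation`, `best_axis_order`, `pop_size`, `generations`) are hypothetical.

```python
import numpy as np


def ordering_fitness(corr_abs, order):
    """Assumed fitness: sum of |correlation| between axes adjacent in the plot."""
    idx = np.asarray(order)
    return float(corr_abs[idx[:-1], idx[1:]].sum())


def order_crossover(p1, p2, rng):
    """Order crossover: keep a slice of p1 in place, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(int(v) for v in rng.choice(n + 1, size=2, replace=False))
    kept = p1[a:b]
    kept_set = set(kept)
    rest = [g for g in p2 if g not in kept_set]
    return rest[:a] + kept + rest[a:]


def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two randomly chosen axes."""
    order = list(order)
    if rng.random() < rate:
        i, j = (int(v) for v in rng.choice(len(order), size=2, replace=False))
        order[i], order[j] = order[j], order[i]
    return order


def best_axis_order(data, pop_size=60, generations=200, seed=0):
    """Search for an axis permutation with high adjacent-axis correlation.

    `data` is an (n_samples, n_dims) array with no constant columns
    (a constant column would make the correlation undefined).
    Requires pop_size >= 4 and at least two dimensions.
    """
    rng = np.random.default_rng(seed)
    corr_abs = np.abs(np.corrcoef(data, rowvar=False))
    n = corr_abs.shape[0]

    def score(order):
        return ordering_fitness(corr_abs, order)

    # Seed with the original column order so the result is never worse than it.
    pop = [list(range(n))]
    pop += [rng.permutation(n).tolist() for _ in range(pop_size - 1)]

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            i, j = (int(v) for v in rng.choice(len(elite), size=2, replace=False))
            child = order_crossover(elite[i], elite[j], rng)
            children.append(swap_mutation(child, rng))
        pop = elite + children

    best = max(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    # Synthetic data only; not one of the gene prioritization datasets.
    demo_rng = np.random.default_rng(42)
    demo = demo_rng.normal(size=(100, 6))
    demo[:, 4] = demo[:, 0] + 0.1 * demo_rng.normal(size=100)
    order, fitness = best_axis_order(demo, generations=50)
    print("axis order:", order, "fitness:", round(fitness, 3))
```

Because the original column order is placed in the initial population and the best half of each generation is carried over unchanged, the returned ordering scores at least as well as the input ordering under this assumed fitness. A different fitness (for example, the maximum rather than the sum of adjacent correlations) would only require changing `ordering_fitness`.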
Pages: 52-57
Number of pages: 6
Related papers
50 in total
  • [1] Organizing and visualizing database data using parallel coordinates
    Presser, Clifton G. M.
    VISUALIZATION AND DATA ANALYSIS 2006, 2006, 6060
  • [2] Visualizing multi-dimensional electromagnetic situation in advanced parallel coordinates
    Zhou, T. (zhouti09@163.com), 1600, Huazhong University of Science and Technology (41):
  • [3] A New Metric on Parallel Coordinates and Its Application for High-Dimensional Data Visualization
    Tran Van Long
    2015 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC), 2015, : 297 - 301
  • [4] Using Penalized Regression with Parallel Coordinates for Visualization of Significance in High Dimensional Data
    Wang, Shengwen
    Yang, Yi
    Chang, Jih-Sheng
    Lin, Fang-Pang
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2013, 4 (10) : 32 - 38
  • [5] Visual Signature of High-Dimensional Geometry in Parallel Coordinates
    Yan, Xiaoqi
    Lai, Chi-Fu
    Fu, Chi-Wing
    2014 IEEE PACIFIC VISUALIZATION SYMPOSIUM (PACIFICVIS), 2014, : 65 - 72
  • [6] Parallel social spider clustering algorithm for high dimensional datasets
    Shukla, Urvashi Prakash
    Nanda, Satyasai Jagannath
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2016, 56 : 75 - 90
  • [7] Visualizing high dimensional structures in geochemical datasets using a combined compositional data analysis and Databionic swarm approach
    Engle, Mark A.
    Chaput, Julien
    INTERNATIONAL JOURNAL OF COAL GEOLOGY, 2023, 275
  • [8] Bayesian Unidimensional Scaling for visualizing uncertainty in high dimensional datasets with latent ordering of observations
    Lan Huong Nguyen
    Holmes, Susan
    BMC BIOINFORMATICS, 2017, 18
  • [9] Bayesian Unidimensional Scaling for visualizing uncertainty in high dimensional datasets with latent ordering of observations
    Lan Huong Nguyen
    Susan Holmes
    BMC Bioinformatics, 18
  • [10] Interactive Local Clustering Operations for High Dimensional Data in Parallel Coordinates
    Guo, Peihong
    Xiao, He
    Wang, Zuchao
    Yuan, Xiaoru
    IEEE PACIFIC VISUALIZATION SYMPOSIUM 2010, 2010, : 97 - 104