Variable selection for nonlinear dimensionality reduction of biological datasets through bootstrapping of correlation networks

Cited by: 0
Authors
Aragones, David G. [1, 2]
Palomino-Segura, Miguel [3, 4, 5]
Sicilia, Jon [3]
Crainiciuc, Georgiana [3]
Ballesteros, Ivan [3]
Sanchez-Cabo, Fatima [6]
Hidalgo, Andres [7, 8]
Calvo, Gabriel F. [1, 2]
Affiliations
[1] Univ Castilla La Mancha, Dept Math, Ciudad Real, Spain
[2] Univ Castilla La Mancha, MOLAB Math Oncol Lab, Ciudad Real, Spain
[3] Ctr Nacl Invest Cardiovasc Carlos III, Area Cell & Dev Biol, Madrid, Spain
[4] Inst Univ Invest Biosanitaria Extremadura INUBE, Immunophysiol Res Grp, Badajoz, Spain
[5] Univ Extremadura, Fac Sci, Dept Physiol, Badajoz, Spain
[6] Ctr Nacl Invest Cardiovasc Carlos III, Bioinformat Unit, Madrid, Spain
[7] Yale Univ, Sch Med, Vasc Biol & Therapeut Program, New Haven, CT USA
[8] Yale Univ, Dept Immunobiol, Sch Med, New Haven, CT USA
Keywords
Artificial intelligence; Machine learning; Unsupervised learning; Feature selection; UMAP; Complex systems; SEQ DATA; CELL; MODEL;
DOI
10.1016/j.compbiomed.2023.107827
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Identifying the most relevant variables or features in massive datasets for dimensionality reduction can lead to improved and more informative display, faster computation times, and more explainable models of complex systems. Despite significant advances and available algorithms, this task generally remains challenging, especially in unsupervised settings. In this work, we propose a method that constructs correlation networks using all intervening variables and then selects the most informative ones based on network bootstrapping. The method can be applied in both supervised and unsupervised scenarios. We demonstrate its functionality by applying Uniform Manifold Approximation and Projection for dimensionality reduction to several high-dimensional biological datasets, derived from 4D live imaging recordings of hundreds of morpho-kinetic variables, describing the dynamics of thousands of individual leukocytes at sites of prominent inflammation. We compare our method with other standard ones in the field, such as Principal Component Analysis and Elastic Net, showing that it outperforms them. The proposed method can be employed in a wide range of applications, encompassing data analysis and machine learning.
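The abstract does not spell out how the network is built or how variables are ranked, so the following is only an illustrative sketch of the general idea (correlation network plus bootstrapping), not the authors' procedure. It assumes an absolute-correlation cutoff (`threshold`) to define edges and uses each variable's mean degree across bootstrap resamples as its score; both choices, and the function names, are hypothetical.

```python
import numpy as np


def bootstrap_network_scores(X, n_boot=200, threshold=0.5, seed=0):
    """Mean degree of each variable across bootstrapped correlation networks.

    X is an (n_samples, n_variables) array. The cutoff and the use of
    degree as the score are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    scores = np.zeros(n_vars)
    for _ in range(n_boot):
        # Resample observations (rows) with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        corr = np.corrcoef(X[idx], rowvar=False)
        # Constant columns yield NaN correlations; treat them as no edge.
        corr = np.nan_to_num(corr)
        # An edge links two variables whose |correlation| reaches the cutoff.
        adj = np.abs(corr) >= threshold
        np.fill_diagonal(adj, False)
        scores += adj.sum(axis=0)
    return scores / n_boot


def select_variables(X, k, **kwargs):
    """Indices of the k variables with the highest mean bootstrap degree."""
    scores = bootstrap_network_scores(X, **kwargs)
    return np.argsort(scores)[::-1][:k]
```

The columns returned by `select_variables` could then be passed to a dimensionality reduction step such as UMAP in place of the full variable set.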
Pages: 21