Variable selection for nonlinear dimensionality reduction of biological datasets through bootstrapping of correlation networks

Authors
Aragones, David G. [1, 2]
Palomino-Segura, Miguel [3, 4, 5]
Sicilia, Jon [3]
Crainiciuc, Georgiana [3]
Ballesteros, Ivan [3]
Sanchez-Cabo, Fatima [6]
Hidalgo, Andres [7, 8]
Calvo, Gabriel F. [1, 2]
Affiliations
[1] Univ Castilla La Mancha, Dept Math, Ciudad Real, Spain
[2] Univ Castilla La Mancha, MOLAB Math Oncol Lab, Ciudad Real, Spain
[3] Ctr Nacl Invest Cardiovasc Carlos III, Area Cell & Dev Biol, Madrid, Spain
[4] Inst Univ Invest Biosanitaria Extremadura INUBE, Immunophysiol Res Grp, Badajoz, Spain
[5] Univ Extremadura, Fac Sci, Dept Physiol, Badajoz, Spain
[6] Ctr Nacl Invest Cardiovasc Carlos III, Bioinformat Unit, Madrid, Spain
[7] Yale Univ, Sch Med, Vasc Biol & Therapeut Program, New Haven, CT USA
[8] Yale Univ, Dept Immunobiol, Sch Med, New Haven, CT USA
Keywords
Artificial intelligence; Machine learning; Unsupervised learning; Feature selection; UMAP; Complex systems; SEQ DATA; CELL; MODEL;
DOI
10.1016/j.compbiomed.2023.107827
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09
Abstract
Identifying the most relevant variables or features in massive datasets for dimensionality reduction can lead to improved and more informative display, faster computation times, and more explainable models of complex systems. Despite significant advances and available algorithms, this task generally remains challenging, especially in unsupervised settings. In this work, we propose a method that constructs correlation networks using all intervening variables and then selects the most informative ones based on network bootstrapping. The method can be applied in both supervised and unsupervised scenarios. We demonstrate its functionality by applying Uniform Manifold Approximation and Projection for dimensionality reduction to several high-dimensional biological datasets, derived from 4D live imaging recordings of hundreds of morpho-kinetic variables, describing the dynamics of thousands of individual leukocytes at sites of prominent inflammation. We compare our method with other standard ones in the field, such as Principal Component Analysis and Elastic Net, showing that it outperforms them. The proposed method can be employed in a wide range of applications, encompassing data analysis and machine learning.
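The abstract describes building a correlation network over all variables and bootstrapping it to rank them, but gives no algorithmic detail. The following is a minimal sketch of that general idea, not the authors' implementation. The function name `bootstrap_degree_scores`, the absolute Pearson correlation threshold, and the use of node degree as the per-variable score are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_degree_scores(X, n_boot=100, threshold=0.5, seed=0):
    """Hypothetical helper: mean and std of each variable's degree in a
    thresholded correlation network, taken over bootstrap resamples of rows."""
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    degrees = np.zeros((n_boot, n_vars))
    for b in range(n_boot):
        # Resample observations (rows) with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        # Pearson correlation between variables (columns).
        corr = np.corrcoef(X[idx], rowvar=False)
        # Assumed network rule: connect variables with |r| >= threshold.
        adj = np.abs(corr) >= threshold
        np.fill_diagonal(adj, False)  # no self-loops
        degrees[b] = adj.sum(axis=0)
    return degrees.mean(axis=0), degrees.std(axis=0)

# Synthetic data: columns 0-2 share a latent factor, columns 3-5 are noise.
rng = np.random.default_rng(42)
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.3 * rng.normal(size=(200, 3)),
               rng.normal(size=(200, 3))])

mean_deg, std_deg = bootstrap_degree_scores(X)
print(np.round(mean_deg, 2))
```

In this sketch the bootstrap only stabilizes the degree estimate. Which variables to keep (for example hubs, or one representative per correlated group) is a separate choice that the abstract does not specify; the retained columns could then be passed to a dimensionality reduction step such as UMAP.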
Pages: 21
Related papers
50 in total
  • [11] Relevance Units Latent Variable Model and Nonlinear Dimensionality Reduction
    Gao, Junbin
    Zhang, Jun
    Tien, David
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2010, 21 (01): 123 - 135
  • [12] Variable Selection through Correlation Sifting
    Huang, Jim C.
    Jojic, Nebojsa
    RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, 2011, 6577 : 106 - 123
  • [13] Nonlinear Dimensionality Reduction and Feature Analysis for Artifact Component Identification in hdEEG Datasets
    Koudelka, Vlastimil
    Strobl, Jan
    Piorecky, Marek
    Brunovsky, Martin
    Krajca, Vladimir
    WORLD CONGRESS ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING 2018, VOL 2, 2019, 68 (02): 415 - 419
  • [14] Dimensionality reduction through clustering of variables and canonical correlation
    Munoz-Pichardo, Juan M.
    Pino-Mejias, Rafael
    Cubiles-de-la-Vega, M. Dolores
    Enguix-Gonzalez, Alicia
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2025, 54 (01) : 63 - 90
  • [15] Ferroelectric Memristive Networks for Dimensionality Reduction: A Process for Effectively Classifying Cancer Datasets
    Raj, P. Michael Preetam
    Louis, V. Jeffry
    Chatterjee, Sumit Kumar
    Kanungo, Sayan
    Kundu, Souvik
    INTEGRATED FERROELECTRICS, 2019, 201 (01) : 126 - 141
  • [16] Linear Dimensionality Reduction through Eigenvector Selection for Object Recognition
    Dornaika, F.
    Assoum, A.
    ADVANCES IN VISUAL COMPUTING, PT I, 2010, 6453 : 276 - +
  • [17] Orthogonal Subspace Based Nonlinear Correlation Learning for Supervised Dimensionality Reduction
    Zhang, Zhao
    Ye, Ning
    Deng, Ning
    Du, Hui
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC 2009), 2009: 779 - +
  • [18] The Curse of Dimensionality and Variable Selection in Nonlinear Non-parametric System Identification
    Bai, Er-Wei
    Zhao, Wenxiao
    Zheng, Wei Xing
    IFAC PAPERSONLINE, 2015, 48 (28): 1279 - 1284
  • [19] A Randomized Subspace-based Approach for Dimensionality Reduction and Important Variable Selection
    Bo, Di
    Hwangbo, Hoon
    Sharma, Vinit
    Arndt, Corey
    Termaath, Stephanie
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [20] Graph Based Feature Selection for Reduction of Dimensionality in Next-Generation RNA Sequencing Datasets
    Gakii, Consolata
    Mireji, Paul O.
    Rimiru, Richard
    ALGORITHMS, 2022, 15 (01)