Variable selection for nonlinear dimensionality reduction of biological datasets through bootstrapping of correlation networks

Authors
Aragones, David G. [1 ,2 ]
Palomino-Segura, Miguel [3 ,4 ,5 ]
Sicilia, Jon [3 ]
Crainiciuc, Georgiana [3 ]
Ballesteros, Ivan [3 ]
Sanchez-Cabo, Fatima [6 ]
Hidalgo, Andres [7 ,8 ]
Calvo, Gabriel F. [1 ,2 ]
Affiliations
[1] Univ Castilla La Mancha, Dept Math, Ciudad Real, Spain
[2] Univ Castilla La Mancha, MOLAB Math Oncol Lab, Ciudad Real, Spain
[3] Ctr Nacl Invest Cardiovasc Carlos III, Area Cell & Dev Biol, Madrid, Spain
[4] Inst Univ Invest Biosanitaria Extremadura INUBE, Immunophysiol Res Grp, Badajoz, Spain
[5] Univ Extremadura, Fac Sci, Dept Physiol, Badajoz, Spain
[6] Ctr Nacl Invest Cardiovasc Carlos III, Bioinformat Unit, Madrid, Spain
[7] Yale Univ, Sch Med, Vasc Biol & Therapeut Program, New Haven, CT USA
[8] Yale Univ, Dept Immunobiol, Sch Med, New Haven, CT USA
Keywords
Artificial intelligence; Machine learning; Unsupervised learning; Feature selection; UMAP; Complex systems; SEQ DATA; CELL; MODEL;
DOI
10.1016/j.compbiomed.2023.107827
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Identifying the most relevant variables or features in massive datasets for dimensionality reduction can lead to improved and more informative display, faster computation times, and more explainable models of complex systems. Despite significant advances and available algorithms, this task generally remains challenging, especially in unsupervised settings. In this work, we propose a method that constructs correlation networks using all intervening variables and then selects the most informative ones based on network bootstrapping. The method can be applied in both supervised and unsupervised scenarios. We demonstrate its functionality by applying Uniform Manifold Approximation and Projection for dimensionality reduction to several high-dimensional biological datasets, derived from 4D live imaging recordings of hundreds of morpho-kinetic variables, describing the dynamics of thousands of individual leukocytes at sites of prominent inflammation. We compare our method with other standard ones in the field, such as Principal Component Analysis and Elastic Net, showing that it outperforms them. The proposed method can be employed in a wide range of applications, encompassing data analysis and machine learning.
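The abstract names two ingredients, a correlation network over all variables and bootstrapping of that network, without giving the selection criterion. The sketch below is only a rough illustration of how those two ingredients might fit together, not the authors' algorithm. The function name `bootstrap_network_selection`, the correlation threshold, and the ranking of variables by mean node degree are all assumptions introduced here for illustration.

```python
import numpy as np

def bootstrap_network_selection(X, n_boot=200, threshold=0.6, top_k=5, seed=0):
    """Hypothetical sketch: rank variables by their mean degree in
    thresholded correlation networks built from bootstrap resamples.

    X is an (n_samples, n_variables) array. The degree-based criterion is
    an assumption made for illustration, not the criterion of the paper.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    degree_sum = np.zeros(n_vars)

    for _ in range(n_boot):
        # Resample observations (rows) with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        corr = np.corrcoef(X[idx], rowvar=False)
        # Guard against NaNs produced by constant columns in a resample.
        corr = np.nan_to_num(corr, nan=0.0)
        # Connect two variables when their absolute correlation is high.
        adj = np.abs(corr) >= threshold
        np.fill_diagonal(adj, False)
        degree_sum += adj.sum(axis=0)

    mean_degree = degree_sum / n_boot
    # Variables with the highest average degree come first.
    ranking = np.argsort(-mean_degree, kind="stable")
    return ranking[:top_k], mean_degree


if __name__ == "__main__":
    # Synthetic data: three variables share a latent signal, three are noise.
    rng = np.random.default_rng(1)
    latent = rng.normal(size=(300, 1))
    X = np.hstack([latent + 0.3 * rng.normal(size=(300, 3)),
                   rng.normal(size=(300, 3))])
    selected, mean_degree = bootstrap_network_selection(X, n_boot=50, top_k=3)
    print("selected variable indices:", sorted(selected.tolist()))
```

Under these assumptions, the selected columns `X[:, selected]` could then be passed to a dimensionality reduction step such as UMAP in place of the full variable set.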
Pages: 21