A Comparison of Dimensionality Reduction Methods for Large Biological Data

Authors
Babjac, Ashley [1]
Royalty, Taylor [2]
Steen, Andrew D. [3]
Emrich, Scott J. [1]
Affiliations
[1] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN 37996 USA
[2] Univ Tennessee, Dept Earth & Planetary Sci, Knoxville, TN USA
[3] Univ Tennessee, Dept Microbiol, Knoxville, TN 37996 USA
Keywords
autoencoders; dimensionality reduction; classification;
DOI
10.1145/3535508.3545536
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large-scale data often suffer from the curse of dimensionality and its associated constraints; therefore, dimensionality reduction is often applied as an early step in machine learning pipelines. In this paper, we directly compare the performance of autoencoders as a dimensionality reduction technique (via the latent space) with that of other established methods: PCA, LASSO, and t-SNE. To make the comparison robust, we use four distinct datasets that vary in feature types, metadata, labels, and size. We test prediction capability using both Support Vector Machines (SVM) and Random Forests (RF). Notably, we conclude that autoencoders are equivalent to the previously established methods as a dimensionality reduction architecture, and often outperform them in both prediction accuracy and run time when condensing large, sparse datasets.
Pages: 7
Related papers
50 in total
  • [31] Comparison between Two Dimensionality Reduction Methods in Time Series
    Zhang, Hanwen
    REVISTA COLOMBIANA DE ESTADISTICA, 2009, 32 (02) : 189 - 212
  • [32] Geometric MDS Performance for Large Data Dimensionality Reduction and Visualization
    Dzemyda, Gintautas
    Sabaliauskas, Martynas
    Medvedev, Viktor
    INFORMATICA, 2022, 33 (02) : 299 - 320
  • [33] A virtual reality data visualization tool for dimensionality reduction methods
    Juan C. Morales-Vega
    Laura Raya
    Manuel Rubio-Sánchez
    Alberto Sanchez
    Virtual Reality, 2024, 28
  • [34] Application of Dimensionality Reduction Methods for Eye Movement Data Classification
    Gruca, Aleksandra
    Harezlak, Katarzyna
    Kasprowski, Pawel
    MAN-MACHINE INTERACTIONS 4, ICMMI 2015, 2016, 391 : 291 - 303
  • [35] A virtual reality data visualization tool for dimensionality reduction methods
    Morales-Vega, Juan C.
    Raya, Laura
    Rubio-Sanchez, Manuel
    Sanchez, Alberto
    VIRTUAL REALITY, 2024, 28 (01)
  • [36] DIMENSIONALITY REDUCTION OF CATEGORICAL DATA: COMPARISON OF HCA AND CATPCA APPROACHES
    Sulc, Zdenek
    Rezankova, Hana
    18TH AMSE: APPLICATIONS OF MATHEMATICS AND STATISTICS IN ECONOMICS, 2015,
  • [37] Comparison of RFID Data Processing Using Dimensionality Reduction Techniques
    Anu, Maria, V
    Mala, G. S. Anandha
    Mathi, K.
    2014 INTERNATIONAL CONFERENCE ON CONTROL, INSTRUMENTATION, COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICCICCT), 2014, : 265 - 268
  • [38] Comparison of dimensionality reduction methods on hyperspectral images for the identification of heathlands and mires
    Jarocinska, Anna
    Kopec, Dominik
    Kycko, Marlena
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [39] A Comparison of Linear and Nonlinear Dimensionality Reduction Methods Applied to Synthetic Speech
    Errity, Andrew
    McKenna, John
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1079 - 1082
  • [40] Comparison of Kernel Entropy Component Analysis with Several Dimensionality Reduction Methods
    马西沛
    张蕾
    孙以泽
    Journal of Donghua University (English Edition), 2017, 34 (04) : 577 - 582