Confidence estimation for t-SNE embeddings using random forest

Authors
Busra Ozgode Yigin
Gorkem Saygili
Affiliation
[1] Tilburg University, Cognitive Sciences and Artificial Intelligence, Tilburg School of Humanities and Digital Sciences
Source
International Journal of Machine Learning and Cybernetics | 2022, Vol. 13
Keywords
t-SNE; Confidence score; Embedding; Dimensionality reduction; Random forest
DOI
Not available
Abstract
Dimensionality reduction algorithms are commonly used to reduce multi-dimensional data so that it can be visualized on a standard display. Although many such algorithms, including t-distributed Stochastic Neighbor Embedding (t-SNE), aim to preserve close neighborhoods in the low-dimensional space, they may fail to do so for every sample and can therefore produce erroneous representations. In this study, we develop a supervised confidence estimation algorithm for detecting erroneous samples in embeddings. The algorithm generates a confidence score for each sample in an embedding based on a distance-oriented score and a random forest regressor. We evaluate its performance on both intra- and inter-domain data and compare it with the neighborhood preservation ratio as a baseline. Our results show that the resulting confidence score provides distinctive information about the correctness of any sample in an embedding compared with the baseline. The source code is available at https://github.com/gsaygili/dimred.
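The abstract names the neighborhood preservation ratio as the baseline but does not define it. A common formulation, assumed here, is the fraction of a sample's k nearest neighbors in the original space that are still among its k nearest neighbors in the embedding. The following is a minimal sketch under that assumption; the function names, the Euclidean metric, and the choice of k are illustrative and are not taken from the authors' repository.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors
    (Euclidean distance, brute force, the point itself excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def neighborhood_preservation_ratio(high_dim, low_dim, k):
    """Per-sample score in [0, 1]: share of the k high-dimensional neighbors
    that are also among the k neighbors in the low-dimensional embedding."""
    if len(high_dim) != len(low_dim):
        raise ValueError("both representations must contain the same samples")
    high_nn = knn_indices(high_dim, k)
    low_nn = knn_indices(low_dim, k)
    return [len(h & l) / k for h, l in zip(high_nn, low_nn)]


# Toy data: two well-separated clusters of three points each.
high = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 10), (11, 10, 10), (10, 11, 10)]
# An embedding that keeps every neighborhood intact.
low_good = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
# An embedding that misplaces sample 2 into the other cluster.
low_bad = [(0, 0), (1, 0), (10.4, 10.2), (10, 10), (11, 10), (10, 11)]

scores_good = neighborhood_preservation_ratio(high, low_good, k=2)
scores_bad = neighborhood_preservation_ratio(high, low_bad, k=2)
```

A low score flags a sample whose neighborhood was not preserved. The brute-force search is quadratic in the number of samples, so a spatial index would be preferable for realistic data sizes.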
Pages: 3981-3992
Number of pages: 11