Confidence estimation for t-SNE embeddings using random forest

Authors
Busra Ozgode Yigin
Gorkem Saygili
Affiliation
[1] Tilburg University, Cognitive Sciences and Artificial Intelligence, Tilburg School of Humanities and Digital Sciences
Source
International Journal of Machine Learning and Cybernetics | 2022, Vol. 13 (12)
Keywords
t-SNE; Confidence score; Embedding; Dimensionality reduction; Random forest
DOI
Not available
Abstract
Dimensionality reduction algorithms are commonly used to reduce multi-dimensional data to a form that can be visualized on a standard display. Although many such algorithms, including t-distributed Stochastic Neighbor Embedding (t-SNE), aim to preserve close neighborhoods in the low-dimensional space, they may fail to do so for every sample and can therefore produce erroneous representations. In this study, we develop a supervised confidence estimation algorithm for detecting erroneous samples in embeddings. The algorithm generates a confidence score for each sample in an embedding based on a distance-oriented score and a random forest regressor. We evaluate its performance on both intra- and inter-domain data and compare it with the neighborhood preservation ratio as a baseline. Our results show that, compared to the baseline, the resulting confidence score provides distinctive information about the correctness of each sample in an embedding. The source code is available at https://github.com/gsaygili/dimred.
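The abstract names the neighborhood preservation ratio as the baseline but does not define it. The sketch below shows one common reading of that idea, and it is an assumption rather than the authors' implementation: for each sample, the fraction of its k nearest neighbors in the original space that are still among its k nearest neighbors in the embedding. The function names, the value of k, and the toy data are all hypothetical.

```python
import math
import random


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Distances from point i to every other point j != i; ties break on index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_preservation_ratio(high_dim, low_dim, k):
    """Per-sample fraction of high-dimensional k-NN that remain k-NN in the embedding.

    Assumed definition for illustration; the paper may define its baseline differently.
    """
    high_nn = knn_indices(high_dim, k)
    low_nn = knn_indices(low_dim, k)
    return [len(h & l) / k for h, l in zip(high_nn, low_nn)]


# Toy data (hypothetical): 30 random 5-D points.
rng = random.Random(0)
high = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(30)]

# A copy of the data stands in for an ideal embedding: all neighborhoods are preserved.
ideal = [row[:] for row in high]
# Unrelated random 2-D points stand in for a poor embedding.
poor = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(30)]

npr_ideal = neighborhood_preservation_ratio(high, ideal, k=5)
npr_poor = neighborhood_preservation_ratio(high, poor, k=5)
```

A score of 1.0 means every neighbor of a sample survived the projection, and lower values flag samples whose local neighborhood changed. The confidence score described in the abstract additionally uses a distance-oriented score and a random forest regressor, which this sketch does not attempt to reproduce.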
Pages: 3981-3992
Page count: 12