Geometric Dataset Distances via Optimal Transport

Authors
Alvarez-Melis, David [1]
Fusi, Nicolo [1]
Affiliations
[1] Microsoft Res New England, Cambridge, MA 02142 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The notion of task similarity is at the core of various machine learning paradigms, such as domain adaptation and meta-learning. Current methods to quantify it are often heuristic and make strong assumptions about the label sets across tasks, and many are architecture-dependent, relying on task-specific optimal parameters (e.g., requiring a model to be trained on each dataset). In this work we propose an alternative notion of distance between datasets that (i) is model-agnostic, (ii) does not involve training, (iii) can compare datasets even if their label sets are completely disjoint, and (iv) has solid theoretical footing. This distance relies on optimal transport, which provides it with rich geometry awareness, interpretable correspondences, and well-understood properties. Our results show that this novel distance provides meaningful comparisons of datasets and correlates well with transfer learning hardness across various experimental settings and datasets.
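As a rough illustration of the optimal transport idea the abstract refers to, the following is a minimal sketch and not the authors' method. It assumes two tiny, equal-size, uniformly weighted datasets and compares feature vectors only; the abstract does not say how the proposed distance handles labels, so labels are left out. The names `sq_euclidean`, `exact_ot_distance`, `dataset_a`, and `dataset_b` are hypothetical. Under these assumptions an optimal plan can be taken to be a one-to-one matching, so a brute-force search over permutations is exact, and the returned matching is one example of an interpretable correspondence.

```python
from itertools import permutations
import math


def sq_euclidean(x, y):
    """Squared Euclidean ground cost between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def exact_ot_distance(xs, ys):
    """2-Wasserstein distance between two equal-size, uniformly weighted point sets.

    Assumption of this sketch: uniform weights and equal sizes, so some optimal
    transport plan is a one-to-one matching. Brute force over permutations is
    then exact, but only feasible for very small inputs.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size datasets")
    n = len(xs)
    best_cost, best_match = math.inf, None
    for perm in permutations(range(n)):
        # Average transport cost when point i of xs is sent to point perm[i] of ys.
        cost = sum(sq_euclidean(xs[i], ys[j]) for i, j in enumerate(perm)) / n
        if cost < best_cost:
            best_cost, best_match = cost, perm
    return math.sqrt(best_cost), best_match


# Hypothetical toy data: features only, labels ignored.
dataset_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dataset_b = [(0.0, 1.0), (3.0, 0.0), (0.0, 0.0)]
distance, matching = exact_ot_distance(dataset_a, dataset_b)
```

Here `matching[i]` is the index in `dataset_b` paired with point `i` of `dataset_a`. Practical problems would use a dedicated optimal transport solver instead of enumeration, since the number of permutations grows factorially.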
Pages: 12