Geometric Dataset Distances via Optimal Transport

Cited by: 0
Authors
Alvarez-Melis, David [1]
Fusi, Nicolo [1]
Affiliations
[1] Microsoft Res New England, Cambridge, MA 02142 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The notion of task similarity is at the core of various machine learning paradigms, such as domain adaptation and meta-learning. Current methods for quantifying it are often heuristic, make strong assumptions about the label sets across tasks, and many are architecture-dependent, relying on task-specific optimal parameters (e.g., they require training a model on each dataset). In this work we propose an alternative notion of distance between datasets that (i) is model-agnostic, (ii) does not involve training, (iii) can compare datasets even if their label sets are completely disjoint, and (iv) has solid theoretical footing. This distance relies on optimal transport, which provides it with rich geometry awareness, interpretable correspondences, and well-understood properties. Our results show that this novel distance provides meaningful comparisons of datasets and correlates well with transfer learning hardness across various experimental settings and datasets.
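The abstract does not spell out how the distance is constructed, so the following is only a minimal sketch of the underlying idea: an optimal transport cost between two small sets of feature vectors. It assumes equal-size sets with uniform weights and a Euclidean ground cost, ignores labels entirely, and uses a brute-force search over matchings. The function name `ot_distance` and the toy data are hypothetical; this is not the authors' distance.

```python
from itertools import permutations
import math


def ot_distance(xs, ys):
    """Exact 1-Wasserstein distance between two equal-size point sets.

    Assumption: uniform weights on every point, so an optimal transport
    plan can be taken to be a one-to-one matching. The brute-force search
    below is only practical for very small sets.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size point sets")
    n = len(xs)
    # Pairwise Euclidean ground cost between points of the two sets.
    cost = [[math.dist(x, y) for y in ys] for x in xs]
    # Cheapest total cost over all one-to-one matchings.
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    return best / n  # each point carries mass 1/n


# Toy "datasets": features only, labels are ignored in this sketch.
dataset_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dataset_b = [(x + 3.0, y + 4.0) for x, y in dataset_a]  # shifted copy

print(ot_distance(dataset_a, dataset_a))  # identical sets
print(ot_distance(dataset_a, dataset_b))  # shifted by the vector (3, 4)
```

Because the second set is a pure translation of the first, the cheapest matching pairs each point with its shifted copy, and the distance equals the length of the shift. For realistic dataset sizes a dedicated optimal transport solver would replace the brute-force search.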
Pages: 12