Towards a rigorous analysis of mutual information in contrastive learning

Cited: 0
Authors
Lee, Kyungeun [1 ,4 ]
Kim, Jaeill [1 ]
Kang, Suhyun [1 ]
Rhee, Wonjong [1 ,2 ,3 ]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, 1 Gwanak Ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, AI Inst, 1 Gwanak Ro, Seoul 08826, South Korea
[4] LG AI Res, 150 Magokjungang Ro, Seoul 07789, South Korea
Funding
National Research Foundation of Singapore
Keywords
Representation learning; Contrastive learning; Mutual information; Unsupervised learning;
DOI
10.1016/j.neunet.2024.106584
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Contrastive learning has emerged as a cornerstone of unsupervised representation learning. Its primary paradigm is an instance discrimination task using the InfoNCE loss, which has been proven to be a form of mutual information. Consequently, it has become common practice to analyze contrastive learning using mutual information as a measure. Yet this approach is difficult in practice because mutual information must be estimated for real-world applications. This creates a gap between the elegance of the mathematical foundation and the complexity of the estimation, hampering the ability to derive solid and meaningful insights from mutual information analysis. In this study, we introduce three novel methods and a few related theorems aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods carry substantial utility. Leveraging them, we reassess three instances of contrastive learning analysis, illustrating the capacity of the proposed methods to deepen comprehension or to rectify pre-existing misconceptions. The main results can be summarized as follows: (1) While small batch sizes influence the range of the training loss, they do not inherently limit the information content of the learned representations or adversely affect downstream performance; (2) Mutual information, with careful selection of positive pairs and post-training estimation, proves to be a superior measure for evaluating practical networks; and (3) Distinguishing between task-relevant and task-irrelevant information is challenging, yet irrelevant information sources do not necessarily compromise generalization on downstream tasks.
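The abstract's first result concerns the relation between InfoNCE and mutual information: since InfoNCE lower-bounds I(z1; z2) by log(N) minus the loss, the batch size N caps the measurable bound but not the representations themselves. The sketch below is a minimal NumPy illustration of this relation, not the paper's method; the function name, temperature value, and toy data are assumptions for demonstration only.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE loss over a batch of positive pairs (z1[i], z2[i]).

    Rows are L2-normalized; for each anchor z1[i], the other rows of z2
    serve as negatives. Real implementations (e.g. SimCLR-style) differ
    in how negatives are pooled across views and devices.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                 # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # -log softmax of positives

# InfoNCE bound: I(z1; z2) >= log(N) - loss, so the estimable mutual
# information is capped at log(batch size) -- the batch-size effect
# the abstract revisits.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss = info_nce_loss(z, z + 0.01 * rng.normal(size=(8, 16)))
mi_lower_bound = np.log(8) - loss
```

With nearly identical positive pairs the loss is small and the bound approaches log(8), showing how a batch of 8 can never certify more than log(8) nats of mutual information regardless of how informative the representations actually are.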
Pages: 17