Towards a rigorous analysis of mutual information in contrastive learning

Cited by: 0
Authors
Lee, Kyungeun [1 ,4 ]
Kim, Jaeill [1 ]
Kang, Suhyun [1 ]
Rhee, Wonjong [1 ,2 ,3 ]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, 1 Gwanak Ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, AI Inst, 1 Gwanak Ro, Seoul 08826, South Korea
[4] LG AI Res, 150 Magokjungang Ro, Seoul 07789, South Korea
Funding
National Research Foundation of Singapore
Keywords
Representation learning; Contrastive learning; Mutual information; Unsupervised learning
DOI
10.1016/j.neunet.2024.106584
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Contrastive learning has emerged as a cornerstone of unsupervised representation learning. Its primary paradigm is an instance discrimination task using the InfoNCE loss, which has been proven to be a form of mutual information. Consequently, it has become common practice to analyze contrastive learning using mutual information as a measure. Yet this approach is difficult in practice, because mutual information must be estimated for real-world applications. This creates a gap between the elegance of the mathematical foundation and the complexity of the estimation, hampering the ability to derive solid and meaningful insights from mutual information analysis. In this study, we introduce three novel methods and a few related theorems aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods carry substantial utility. Leveraging them, we reassess three instances of contrastive learning analysis, illustrating the capacity of the proposed methods to deepen comprehension or to rectify pre-existing misconceptions. The main results can be summarized as follows: (1) While small batch sizes influence the range of the training loss, they do not inherently limit the information content of the learned representation or adversely affect downstream performance; (2) Mutual information, with careful selection of positive pairings and post-training estimation, proves to be a superior measure for evaluating practical networks; and (3) Distinguishing between task-relevant and task-irrelevant information is challenging, yet irrelevant information sources do not necessarily compromise generalization on downstream tasks.
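For context on the InfoNCE loss and the mutual-information connection the abstract builds on, a minimal sketch follows. It implements the standard InfoNCE formulation of Oord et al. (2018) together with the well-known lower bound it induces; it is not the authors' code, and the function name, temperature value, and batch shape are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE over a batch of N positive pairs (z1[i], z2[i]).

    Row i of the similarity matrix scores z1[i] against every z2[j];
    the diagonal entry is the positive, the other N-1 rows act as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                   # (N, N) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

# InfoNCE induces the mutual-information lower bound
#   I(z1; z2) >= log N - L_InfoNCE,
# so the estimate is capped at log N by the batch size, even when the
# representation itself carries more information.
N, d = 256, 128
z1, z2 = torch.randn(N, d), torch.randn(N, d)
loss = info_nce_loss(z1, z2)
mi_lower_bound = float(torch.log(torch.tensor(float(N))) - loss)
```

Because the bound cannot exceed log N, a small batch narrows the achievable loss range without, by itself, saying anything about the information content of the representation, which is the distinction result (1) formalizes.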
Pages: 17