Towards a rigorous analysis of mutual information in contrastive learning

Cited by: 0
Authors
Lee, Kyungeun [1 ,4 ]
Kim, Jaeill [1 ]
Kang, Suhyun [1 ]
Rhee, Wonjong [1 ,2 ,3 ]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, 1 Gwanak Ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, AI Inst, 1 Gwanak Ro, Seoul 08826, South Korea
[4] LG AI Res, 150 Magokjungang Ro, Seoul 07789, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Representation learning; Contrastive learning; Mutual information; Unsupervised learning;
DOI
10.1016/j.neunet.2024.106584
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive learning has emerged as a cornerstone in unsupervised representation learning. Its primary paradigm involves an instance discrimination task utilizing the InfoNCE loss, which has been proven to be a form of mutual information. Consequently, it has become common practice to analyze contrastive learning using mutual information as a measure. Yet this approach presents difficulties because mutual information must be estimated in real-world applications, creating a gap between the elegance of its mathematical foundation and the complexity of its estimation, and thereby hampering the ability to derive solid and meaningful insights from mutual information analysis. In this study, we introduce three novel methods and several related theorems aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods carry substantial utility. Leveraging them, we reassess three instances of contrastive learning analysis, illustrating the capacity of the proposed methods to facilitate deeper comprehension or to rectify pre-existing misconceptions. The main results can be summarized as follows: (1) while small batch sizes influence the range of the training loss, they do not inherently limit the learned representation's information content or adversely affect downstream performance; (2) mutual information, with careful selection of positive pairings and post-training estimation, proves to be a superior measure for evaluating practical networks; and (3) distinguishing between task-relevant and task-irrelevant information presents challenges, yet irrelevant information sources do not necessarily compromise the generalization of downstream tasks.
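As context for results (1) and (2), the relationship the abstract invokes is the standard InfoNCE bound of Oord et al. (2018); a minimal statement of it, assuming a learned critic f that scores one positive pair against N-1 negatives drawn from the batch:

    \mathcal{L}_{\mathrm{InfoNCE}} \;=\; -\,\mathbb{E}\!\left[\log \frac{f(x, y^{+})}{\sum_{j=1}^{N} f(x, y_j)}\right],
    \qquad
    I(X;Y) \;\ge\; \log N - \mathcal{L}_{\mathrm{InfoNCE}}.

Because this bound can never certify more than log N nats of mutual information, a small batch size caps the loss-derived estimate rather than the information actually encoded in the representation, which is consistent with result (1) above.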
Pages: 17