Adaptively Denoising Graph Neural Networks for Knowledge Distillation

Cited by: 0
Authors
Guo, Yuxin [1 ]
Yang, Cheng [1 ]
Shi, Chuan [1 ]
Tu, Ke [2 ]
Wu, Zhengwei [2 ]
Zhang, Zhiqiang [2 ]
Zhou, Jun [2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Ant Financial, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Graph Neural Networks; Knowledge Distillation;
DOI
10.1007/978-3-031-70371-3_15
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph Neural Networks (GNNs) have excelled in various graph-based applications. Recently, knowledge distillation (KD) has provided a new approach to further boost GNN performance. However, during KD the GNN student may encounter noise while learning from both the GNN teacher and the input graph. The GNN teacher may carry noise, since deep models inevitably fit noise during training, and this error then propagates to the GNN student. In addition, noisy structures in the input graph may disrupt information flow during message passing in GNNs. Hence, we propose DKDG to adaptively remove noise in both the GNN teacher and the graph structure for better distillation. DKDG comprises two modules: (1) a teacher knowledge denoising module, which separates the GNN teacher's knowledge into noise knowledge and label knowledge and removes the student parameters that fit the noise knowledge; and (2) a graph structure denoising module, designed to enhance the discrimination of node representations. Specifically, we propose a discrimination-preserving objective based on a total variation loss and update the edge weights between adjacent nodes to minimize this objective. The two modules are integrated through the GNN's forward propagation and trained iteratively. Experiments on five benchmark datasets and three GNNs demonstrate that the GNN student distilled by DKDG achieves a 1.86% relative improvement over the best baseline among recent state-of-the-art GNN-based KD methods.
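The abstract does not give the exact form of the discrimination-preserving objective, so the following is only a minimal sketch of the underlying idea: a weighted graph total variation term over learnable edge weights, minimized by gradient descent to down-weight noisy edges. All names, the sigmoid edge-weight parameterization, and the regularizer are illustrative assumptions, not the paper's implementation.

# Hypothetical sketch of total-variation-based edge reweighting (PyTorch).
# The paper's discrimination-preserving objective may be defined differently.
import torch

def total_variation(h, edge_index, edge_logits):
    """Weighted graph total variation: sum over edges (i, j) of w_ij * ||h_i - h_j||^2."""
    src, dst = edge_index                       # edge_index has shape (2, num_edges)
    w = torch.sigmoid(edge_logits)              # keep edge weights in (0, 1)
    diff = (h[src] - h[dst]).pow(2).sum(dim=1)  # per-edge squared distance
    return (w * diff).sum()

# Toy graph: 4 nodes with 16-dim representations, 3 edges.
h = torch.randn(4, 16)                             # node representations (e.g. from a GNN layer)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])  # (source, destination) node pairs
edge_logits = torch.zeros(edge_index.size(1), requires_grad=True)

opt = torch.optim.Adam([edge_logits], lr=0.01)
for _ in range(100):                               # iterative edge-weight updates
    opt.zero_grad()
    loss = total_variation(h, edge_index, edge_logits)
    # Illustrative regularizer so that weights do not all collapse to zero.
    loss = loss + 0.1 * (1.0 - torch.sigmoid(edge_logits)).pow(2).sum()
    loss.backward()
    opt.step()

Under this toy objective, edges connecting dissimilar node representations receive smaller weights, which is one plausible way to realize the "update edge weights between adjacent nodes to minimize this objective" step described in the abstract.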
Pages: 253 - 269
Number of pages: 17