Knowledge Distillation via Token-Level Relationship Graph Based on the Big Data Technologies

Cited by: 3
Authors
Zhang, Shuoxi [1 ]
Liu, Hanpeng [1 ]
He, Kun [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, 1037 Luoyu Rd, Wuhan 430074, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Knowledge distillation; Graph representation; Graph-based distillation; Big data technology; NEURAL-NETWORKS;
DOI
10.1016/j.bdr.2024.100438
Chinese Library Classification (CLC) code
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the big data era, characterized by vast volumes of complex data, the efficiency of machine learning models is of utmost importance, particularly in the context of intelligent agriculture. Knowledge distillation (KD), a technique aimed at both model compression and performance enhancement, serves as a pivotal solution by distilling the knowledge from an elaborate model (teacher) to a lightweight, compact counterpart (student). However, the true potential of KD has not been fully explored. Existing approaches primarily focus on transferring instance-level information using big data technologies, overlooking the valuable information embedded in token-level relationships, which may be particularly affected by long-tail effects. To address these limitations, we propose a novel method called Knowledge Distillation with Token-level Relationship Graph (TRG) that leverages token-wise relationships to enhance the performance of knowledge distillation. By employing TRG, the student model can effectively emulate higher-level semantic information from the teacher model, resulting in improved performance and mobile-friendly efficiency. To further enhance the learning process, we introduce a dynamic temperature adjustment strategy, which encourages the student model to capture the topology of the teacher model more effectively. We conduct experiments to evaluate the effectiveness of the proposed method against several state-of-the-art approaches. Empirical results demonstrate the superiority of TRG across various visual tasks, including those involving imbalanced data. Our method consistently outperforms the existing baselines, establishing a new state-of-the-art performance in the field of KD based on big data technologies.
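To make the idea in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of what a token-level relationship-graph distillation loss with an adjustable temperature could look like. The function names (token_relation_graph, trg_distillation_loss), the cosine-similarity graph, and the KL-based row matching are assumptions made for illustration only; they are not taken from the paper's actual formulation.

import torch
import torch.nn.functional as F

def token_relation_graph(tokens: torch.Tensor) -> torch.Tensor:
    # tokens: (batch, num_tokens, dim) token features, e.g. from a ViT encoder.
    # Returns a (batch, num_tokens, num_tokens) pairwise cosine-similarity
    # matrix acting as the dense adjacency of the token relationship graph.
    tokens = F.normalize(tokens, dim=-1)
    return tokens @ tokens.transpose(-1, -2)

def trg_distillation_loss(student_tokens, teacher_tokens, temperature=1.0):
    # Hypothetical relation-graph distillation loss: each token's row of
    # relations is softened by `temperature` and matched with KL divergence,
    # so the student mimics the teacher's token-to-token topology.
    g_s = token_relation_graph(student_tokens) / temperature
    g_t = token_relation_graph(teacher_tokens) / temperature
    return F.kl_div(
        F.log_softmax(g_s, dim=-1),
        F.softmax(g_t, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

# Example usage with random features standing in for real tokens; the relation
# graphs are N x N regardless of embedding width, so student and teacher
# feature dimensions do not have to match.
student_feats = torch.randn(8, 197, 384)   # e.g. a DeiT-Small-sized student
teacher_feats = torch.randn(8, 197, 768)   # a wider teacher
loss = trg_distillation_loss(student_feats, teacher_feats, temperature=2.0)

A dynamic temperature adjustment strategy, as mentioned in the abstract, would amount to passing a schedule-dependent value for temperature at each training step rather than keeping it fixed.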
Pages: 12