Knowledge Distillation via Token-Level Relationship Graph Based on the Big Data Technologies

Cited by: 3
Authors
Zhang, Shuoxi [1 ]
Liu, Hanpeng [1 ]
He, Kun [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, 1037 Luoyu Rd, Wuhan 430074, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Knowledge distillation; Graph representation; Graph-based distillation; Big data technology; NEURAL-NETWORKS;
DOI
10.1016/j.bdr.2024.100438
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Code
081104; 0812; 0835; 1405;
Abstract
In the big data era, characterized by vast volumes of complex data, the efficiency of machine learning models is of utmost importance, particularly in the context of intelligent agriculture. Knowledge distillation (KD), a technique aimed at both model compression and performance enhancement, serves as a pivotal solution by distilling knowledge from an elaborate model (the teacher) to a lightweight, compact counterpart (the student). However, the true potential of KD has not been fully explored. Existing approaches primarily focus on transferring instance-level information via big data technologies, overlooking the valuable information embedded in token-level relationships, which may be particularly affected by long-tail effects. To address these limitations, we propose a novel method, Knowledge Distillation with Token-level Relationship Graph (TRG), that leverages token-wise relationships to enhance the performance of knowledge distillation. By employing TRG, the student model can effectively emulate higher-level semantic information from the teacher model, resulting in improved performance and mobile-friendly efficiency. To further enhance the learning process, we introduce a dynamic temperature adjustment strategy, which encourages the student model to capture the topological structure of the teacher model more effectively. We conduct experiments to evaluate the effectiveness of the proposed method against several state-of-the-art approaches. Empirical results demonstrate the superiority of TRG across various visual tasks, including those involving imbalanced data. Our method consistently outperforms the existing baselines, establishing a new state of the art in KD based on big data technologies.
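The abstract does not give the exact formulation of TRG, but its core idea (the student matching the teacher's token-wise relationship graph under a temperature that changes during training) can be illustrated with a minimal PyTorch-style sketch. Everything below is an assumption made for illustration only: the cosine-similarity graph, the row-wise KL matching loss, the linear temperature schedule, the shared token count N for teacher and student, and all function names are hypothetical, not the authors' implementation.

```python
# Minimal sketch of token-level relationship-graph distillation (illustrative only).
# Assumed setup: teacher and student expose token feature maps of shape (B, N, D_t) and
# (B, N, D_s); the relation graph is a cosine-similarity matrix over the N tokens; the
# graphs are matched with a row-wise KL divergence under an annealed temperature.
import torch
import torch.nn.functional as F


def token_relation_graph(tokens: torch.Tensor, temperature: float) -> torch.Tensor:
    """Build a soft token-to-token relationship graph from token features.

    tokens: (B, N, D) token embeddings; returns a (B, N, N) row-stochastic graph.
    """
    tokens = F.normalize(tokens, dim=-1)                # work in cosine-similarity space
    sim = torch.bmm(tokens, tokens.transpose(1, 2))     # (B, N, N) pairwise similarities
    return F.softmax(sim / temperature, dim=-1)         # soften edge weights with temperature


def trg_loss(student_tokens: torch.Tensor,
             teacher_tokens: torch.Tensor,
             temperature: float) -> torch.Tensor:
    """KL divergence between the student's and teacher's token-relationship graphs."""
    g_s = token_relation_graph(student_tokens, temperature)
    g_t = token_relation_graph(teacher_tokens, temperature)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(g_s.clamp_min(1e-8).log(), g_t, reduction="batchmean")


def dynamic_temperature(step: int, total_steps: int,
                        t_start: float = 4.0, t_end: float = 1.0) -> float:
    """Hypothetical schedule: linearly anneal the temperature over training."""
    ratio = min(step / max(total_steps, 1), 1.0)
    return t_start + (t_end - t_start) * ratio


# Usage sketch inside a training loop (teacher frozen, student trainable):
# tau = dynamic_temperature(step, total_steps)
# loss = task_loss + lambda_trg * trg_loss(student_feats, teacher_feats.detach(), tau)
```

Because each graph is built within that model's own feature space, the teacher and student embedding dimensions need not match, which is one practical appeal of relation-based distillation; only the token count is assumed to be aligned here.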
Pages: 12
Related Papers (50 records in total)
[1] Wei, Jingxuan; Sun, Linzhuang; Leng, Yichong; Tan, Xu; Yu, Bihui; Guo, Ruifeng. Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024: 6531-6540.
[2] Sun, Hao; Tan, Xu; Gan, Jun-Wei; Liu, Hongzhi; Zhao, Sheng; Qin, Tao; Liu, Tie-Yan. Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion. Interspeech 2019, 2019: 2115-2119.
[3] Liu, Yufan; Cao, Jiajiong; Li, Bing; Yuan, Chunfeng; Hu, Weiming; Li, Yangxi; Duan, Yunqiang. Knowledge Distillation via Instance Relationship Graph. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 7099-7107.
[4] Wang, Yan; Xin, Xin; Guo, Ping. Relation Extraction via Attention-Based CNNs Using Token-Level Representations. 2019 15th International Conference on Computational Intelligence and Security (CIS 2019), 2019: 113-117.
[5] Levit, Michael; Stolcke, Andreas; Chang, Shuangyu; Parthasarathy, Sarangarajan. Token-Level Interpolation for Class-Based Language Models. 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015: 5426-5430.
[6] Choi, Sehyun; Fang, Tianqing; Wang, Zhaowei; Song, Yangqiu. KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023: 14035-14053.
[7] Choi, Hyeong Kyu; Choi, Joonmyung; Kim, Hyunwoo J. TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
[8] Zhang, Xu; Wan, Xiaojun. MIL-Decoding: Detoxifying Language Models at Token-Level via Multiple Instance Learning. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Vol. 1, 2023: 190-202.
[9] Fadeeva, Ekaterina; Rubashevskii, Aleksandr; Shelmanov, Artem; Petrakov, Sergey; Li, Haonan; Mubarak, Hamdy; Tsymbalov, Evgenii; Kuzmin, Gleb; Panchenko, Alexander; Baldwin, Timothy; Nakov, Preslav; Panov, Maxim. Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification. Findings of the Association for Computational Linguistics: ACL 2024, 2024: 9367-9385.
[10] Johnson, Dylan; McDonald, Jeffrey T.; Benton, Ryan G.; Bourrie, David. Effectiveness of Image-Based Deep Learning on Token-Level Software Vulnerability Detection. SoutheastCon 2024, 2024: 1054-1063.