Self-attention in Knowledge Tracing: Why It Works

Cited by: 3
Authors
Pu, Shi [1 ]
Becker, Lee [1 ]
Affiliations
[1] Educ Testing Serv, 660 Rosedale Rd, Princeton, NJ 08540 USA
Source
ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I | 2022 / Vol. 13355
Keywords
Deep knowledge tracing; Self-attention; Knowledge tracing
DOI
10.1007/978-3-031-11644-5_75
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Knowledge tracing refers to the dynamic assessment of a learner's mastery of skills. The self-attention mechanism has been widely adopted in knowledge-tracing models in recent years, and these models consistently report performance gains over baseline knowledge-tracing models on public datasets. However, why the self-attention mechanism works in knowledge tracing is unknown. This study argues that the ability to encode when a learner attempts to answer the same item multiple times in a row (henceforth referred to as repeated attempts) is a significant reason why self-attention models perform better than other deep knowledge-tracing models. We present two experiments to support our argument, using context-aware knowledge tracing (AKT) as our example self-attention model and dynamic key-value memory networks (DKVMN) and deep performance factors analysis (DPFA) as our baseline models. First, we show that removing repeated attempts from the datasets closes the performance gap between AKT and the baseline models. Second, we present DPFA+, an extension of DPFA that consumes manually crafted repeated-attempts features, and demonstrate that DPFA+ outperforms AKT across all datasets.
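The following is a minimal Python sketch, not the paper's implementation, illustrating the two manipulations the abstract describes: removing consecutive repeated attempts from a learner's interaction sequence (as in the first experiment) and deriving a simple repeated-attempts count feature of the kind DPFA+ might consume (as in the second). The interaction format, function names, and the exact feature definition are assumptions.

# Hypothetical sketch of the two manipulations described in the abstract;
# field names and the feature definition are assumptions, not the paper's.

from typing import List, Tuple

Interaction = Tuple[int, int]  # (item_id, correct), with correct in {0, 1}

def drop_repeated_attempts(seq: List[Interaction]) -> List[Interaction]:
    """Keep only the first attempt in any run of consecutive attempts at the same item."""
    filtered: List[Interaction] = []
    for item, correct in seq:
        if filtered and filtered[-1][0] == item:
            continue  # skip an immediate re-attempt of the same item
        filtered.append((item, correct))
    return filtered

def repeated_attempt_counts(seq: List[Interaction]) -> List[int]:
    """For each interaction, count the immediately preceding attempts at the same item."""
    counts: List[int] = []
    run = 0
    prev_item = None
    for item, _ in seq:
        run = run + 1 if item == prev_item else 0
        counts.append(run)
        prev_item = item
    return counts

if __name__ == "__main__":
    seq = [(7, 0), (7, 0), (7, 1), (3, 1), (7, 0)]
    print(drop_repeated_attempts(seq))   # [(7, 0), (3, 1), (7, 0)]
    print(repeated_attempt_counts(seq))  # [0, 1, 2, 0, 0]

Under these assumptions, the first function mimics the dataset ablation that closes the AKT-versus-baseline gap, while the second yields one plausible hand-crafted repeated-attempts feature that a DPFA-style model could take as input.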
Pages: 731-736
Page count: 6
Related papers (50 records in total)
  • [21] On the Integration of Self-Attention and Convolution
    Pan, Xuran
    Ge, Chunjiang
    Lu, Rui
    Song, Shiji
    Chen, Guanfu
    Huang, Zeyi
    Huang, Gao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 805 - 815
  • [22] On The Computational Complexity of Self-Attention
    Keles, Feyza Duman
    Wijewardena, Pruthuvi Mahesakya
    Hegde, Chinmay
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 597 - 619
  • [23] The Lipschitz Constant of Self-Attention
    Kim, Hyunjik
    Papamakarios, George
    Mnih, Andriy
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [24] The function of the self-attention network
    Cunningham, Sheila J.
    COGNITIVE NEUROSCIENCE, 2016, 7 (1-4) : 21 - 22
  • [25] Convolutional Self-Attention Networks
    Yang, Baosong
    Wang, Longyue
    Wong, Derek F.
    Chao, Lidia S.
    Tu, Zhaopeng
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4040 - 4045
  • [26] Self-Attention Graph Pooling
    Lee, Junhyun
    Lee, Inyeop
    Kang, Jaewoo
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [27] FOCUS OF ATTENTION IN GROUPS - A SELF-ATTENTION PERSPECTIVE
    MULLEN, B
    CHAPMAN, JG
    PEAUGH, S
    JOURNAL OF SOCIAL PSYCHOLOGY, 1989, 129 (06): : 807 - 817
  • [28] Masked face recognition based on knowledge distillation and convolutional self-attention network
    Wan, Weiguo
    Wen, Runlin
    Yao, Li
    Yang, Yong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, : 2269 - 2284
  • [29] SEMANTIC IMAGES SEGMENTATION FOR AUTONOMOUS DRIVING USING SELF-ATTENTION KNOWLEDGE DISTILLATION
    Karine, Ayoub
    Napoleon, Thibault
    Jridi, Maher
    2022 16TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS, SITIS, 2022, : 198 - 202
  • [30] Knowledge-Aware Self-Attention Networks for Document Grounded Dialogue Generation
    Tang, Xiangru
    Hu, Po
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 400 - 411