Recasting Self-Attention with Holographic Reduced Representations

Cited by: 0
Authors
Alam, Mohammad Mahmudul [1 ]
Raff, Edward [1 ,2 ,3 ]
Biderman, Stella [2 ,3 ,4 ]
Oates, Tim [1 ]
Holt, James [2 ]
Affiliations
[1] Univ Maryland Baltimore Cty, Dept Comp Sci & Elect Engn, Baltimore, MD 21228 USA
[2] Lab Phys Sci, College Pk, MD 20740 USA
[3] Booz Allen Hamilton, Mclean, VA 22102 USA
[4] EleutherAI, New York, NY USA
Keywords
DETECT;
DOI
Not available
CLC Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In recent years, self-attention has become the dominant paradigm for sequence modeling in a variety of domains. However, in domains with very long sequence lengths the O(T^2) memory and O(T^2 H) compute costs can make using transformers infeasible. Motivated by problems in malware detection, where sequence lengths of T >= 100,000 are a roadblock to deep learning, we re-cast self-attention using the neuro-symbolic approach of Holographic Reduced Representations (HRR). In doing so we perform the same high-level strategy of the standard self-attention: a set of queries matching against a set of keys, and returning a weighted response of the values for each key. Implemented as a "Hrrformer" we obtain several benefits including O(T H log H) time complexity, O(T H) space complexity, and convergence in 10x fewer epochs. Nevertheless, the Hrrformer achieves near state-of-the-art accuracy on LRA benchmarks and we are able to learn with just a single layer. Combined, these benefits make our Hrrformer the first viable Transformer for such long malware classification sequences and up to 280x faster to train on the Long Range Arena benchmark. Code is available at https://github.com/NeuromorphicComputationResearchProgram/Hrrformer
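The complexity claims in the abstract follow from replacing the T x T attention matrix with HRR binding (circular convolution, O(H log H) per vector via the FFT) and unbinding (retrieval with the HRR approximate inverse). Below is a minimal NumPy sketch of that general mechanism, not the authors' Hrrformer implementation: the function names (bind, approx_inverse, hrr_attention) and the single-trace superposition retrieval are illustrative assumptions, and the full model additionally weights the responses per query, which this sketch omits.

```python
import numpy as np

# Hedged sketch of HRR-style attention; NOT the authors' Hrrformer code.
# Binding is circular convolution (O(H log H) via the FFT); unbinding applies
# the HRR approximate inverse followed by another binding.

def bind(a, b):
    """HRR binding: circular convolution of H-dimensional vectors (last axis)."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.shape[-1])

def approx_inverse(a):
    """HRR approximate inverse: index permutation x_i -> x_{(-i) mod H}."""
    return np.roll(a[..., ::-1], 1, axis=-1)

def hrr_attention(Q, K, V):
    """Bind each key to its value, superimpose the T bindings into a single
    H-dimensional trace, and let each query unbind a (noisy) response.
    Time is O(T H log H) and memory is O(T H); no T x T matrix is formed."""
    trace = bind(K, V).sum(axis=0)                  # superposition of key-value pairs
    return bind(approx_inverse(Q), trace[None, :])  # per-query retrieval from the trace

# Illustrative usage with hypothetical sizes (T = sequence length, H = feature dim).
T, H = 8, 64
rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, T, H)) / np.sqrt(H)
print(hrr_attention(Q, K, V).shape)  # (8, 64)
```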
Pages: 490 - 507
Number of pages: 18
Related Papers
50 records in total
  • [21] Encoding Structure in Holographic Reduced Representations
    Kelly, Matthew A.
    Blostein, Dorothea
    Mewhort, D. J. K.
    CANADIAN JOURNAL OF EXPERIMENTAL PSYCHOLOGY-REVUE CANADIENNE DE PSYCHOLOGIE EXPERIMENTALE, 2013, 67 (02): 79 - 93
  • [22] Audio Fingerprinting with Holographic Reduced Representations
    Fujita, Yusuke
    Komatsu, Tatsuya
    INTERSPEECH 2024, 2024, : 62 - 66
  • [23] SELF-ATTENTION, CONCEPT ACTIVATION, AND THE CAUSAL SELF
    FENIGSTEIN, A
    LEVINE, MP
    JOURNAL OF EXPERIMENTAL SOCIAL PSYCHOLOGY, 1984, 20 (03) : 231 - 245
  • [24] A relational extraction approach based on multiple embedding representations and multi-head self-attention
    Qin, Zhi
    Liu, Enyang
    Zhang, Shibin
    Chang, Yan
    Yan, Lili
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (03) : 7093 - 7107
  • [25] Integrating the Pre-trained Item Representations with Reformed Self-attention Network for Sequential Recommendation
    Liang, Guanzhong
    Liao, Jie
    Zhou, Wei
    Wen, Junhao
    2022 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES (IEEE ICWS 2022), 2022, : 27 - 36
  • [26] Self-Attention Based Action Segmentation Using Intra-And Inter-Segment Representations
    Patsch, Constantin
    Steinbach, Eckehard
    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2023
  • [27] Research of Self-Attention in Image Segmentation
    Cao, Fude
    Zheng, Chunguang
    Huang, Limin
    Wang, Aihua
    Zhang, Jiong
    Zhou, Feng
    Ju, Haoxue
    Guo, Haitao
    Du, Yuxia
    JOURNAL OF INFORMATION TECHNOLOGY RESEARCH, 2022, 15 (01)
  • [28] Improve Image Captioning by Self-attention
    Li, Zhenru
    Li, Yaoyi
    Lu, Hongtao
    NEURAL INFORMATION PROCESSING, ICONIP 2019, PT V, 2019, 1143 : 91 - 98
  • [29] Self-Attention Generative Adversarial Networks
    Zhang, Han
    Goodfellow, Ian
    Metaxas, Dimitris
    Odena, Augustus
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [30] Rethinking the Self-Attention in Vision Transformers
    Kim, Kyungmin
    Wu, Bichen
    Dai, Xiaoliang
    Zhang, Peizhao
    Yan, Zhicheng
    Vajda, Peter
    Kim, Seon
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3065 - 3069