Robust Representation Learning via Sparse Attention Mechanism for Similarity Models

Cited by: 0
Authors
Ermilova, Alina [1 ]
Baramiia, Nikita [1 ]
Kornilov, Valerii [1 ]
Petrakov, Sergey [1 ]
Zaytsev, Alexey [1 ,2 ]
Affiliations
[1] Skolkovo Inst Sci & Technol, Moscow 121205, Russia
[2] Sber, Risk Management, Moscow 121165, Russia
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Transformers; Oil insulation; Task analysis; Time series analysis; Meteorology; Training; Deep learning; Representation learning; efficient transformer; robust transformer; representation learning; similarity learning; TRANSFORMER;
DOI
10.1109/ACCESS.2024.3418779
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Attention-based models are widely used for time series data. However, due to the quadratic complexity of attention with respect to input sequence length, the application of Transformers is limited by high resource demands. Moreover, their modifications for industrial time series need to be robust to missing or noisy values, which complicates the expansion of their application horizon. To cope with these issues, we introduce a class of efficient Transformers named Regularized Transformers (Reguformers). We implement a regularization technique inspired by dropout to improve robustness and reduce computational expenses without significantly modifying the pipeline. Our experiments focus on oil & gas data. For the well-interval similarity task, our best Reguformer configuration reaches a ROC AUC of 0.97, which is comparable to Informer (0.978) and outperforms the baselines: the previous LSTM model (0.934), the classical Transformer model (0.967), and three recent, promising modifications of the original Transformer, namely Performer (0.949), LRformer (0.955), and DropDim (0.777). We also conduct the corresponding experiments on three additional datasets from different domains and obtain superior results. The improvement of the best Reguformer over the Transformer varies from 3.7% to 9.6% across datasets, while the improvement over Informer spans a wider range: from 1.7% to 18.4%.
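The abstract describes a dropout-inspired regularization of attention that both improves robustness to missing values and reduces the cost of the quadratic attention matrix. Below is a minimal, hypothetical sketch of that idea: during training, a random subset of key/value positions is dropped before computing scaled dot-product attention, so the model learns to tolerate missing inputs while the score matrix shrinks. This is an illustration only, not the authors' Reguformer implementation; the function name, the keep_ratio parameter, and the per-batch subsampling strategy are assumptions.

# A minimal sketch (not the authors' code) of dropout-inspired attention
# regularization: randomly drop key/value positions during training.
# All names (regularized_attention, keep_ratio) are assumptions.
import torch
import torch.nn.functional as F

def regularized_attention(q, k, v, keep_ratio=0.75, training=True):
    # q, k, v: (batch, seq_len, dim) tensors.
    _, seq_len, dim = k.shape
    if training and keep_ratio < 1.0:
        n_keep = max(1, int(seq_len * keep_ratio))
        # Keep the same random subset of key/value positions for the batch;
        # the score matrix shrinks from seq_len^2 to seq_len * n_keep entries.
        idx = torch.randperm(seq_len, device=k.device)[:n_keep]
        k, v = k[:, idx, :], v[:, idx, :]
    # Standard scaled dot-product attention on the (possibly reduced) keys.
    scores = q @ k.transpose(-2, -1) / dim ** 0.5
    return F.softmax(scores, dim=-1) @ v

# Usage: full attention is restored at inference time (training=False).
x = torch.randn(2, 128, 64)
out = regularized_attention(x, x, x, keep_ratio=0.75, training=True)
print(out.shape)  # torch.Size([2, 128, 64])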
Pages: 97833-97850
Number of pages: 18
Related Papers
50 records in total
  • [41] Integrated Sparse Coding With Graph Learning for Robust Data Representation
    Zhang, Yupei
    Liu, Shuhui
    IEEE ACCESS, 2020, 8: 161245-161260
  • [42] Robust visual tracking based on online learning sparse representation
    Zhang, Shengping
    Yao, Hongxun
    Zhou, Huiyu
    Sun, Xin
    Liu, Shaohui
    NEUROCOMPUTING, 2013, 100: 31-40
  • [44] Label Distribution Learning via Sample Sparse Representation
    Shao J.
    Yuan S.
    Liu X.
    Liu R.
    JOURNAL OF XI'AN JIAOTONG UNIVERSITY, 2020, 54: 139-148
  • [45] LEARNING DICTIONARY VIA SUBSPACE SEGMENTATION FOR SPARSE REPRESENTATION
    Feng, Jianzhou
    Song, Li
    Yang, Xiaokang
    Zhang, Wenjun
    2011 18TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2011: 1245-1248
  • [46] Efficient Network Representation Learning via Cluster Similarity
    Fujiwara, Yasuhiro
    Ida, Yasutoshi
    Kumagai, Atsutoshi
    Nakano, Masahiro
    Kimura, Akisato
    Ueda, Naonori
    DATA SCIENCE AND ENGINEERING, 2023, 8 (3): 279-291
  • [47] Representation Learning via Cauchy Convolutional Sparse Coding
    Mayo, Perla
    Karakus, Oktay
    Holmes, Robin
    Achim, Alin
    IEEE ACCESS, 2021, 9 (09): 100447-100459
  • [48] Deformable segmentation via sparse representation and dictionary learning
    Zhang, Shaoting
    Zhan, Yiqiang
    Metaxas, Dimitris N.
    MEDICAL IMAGE ANALYSIS, 2012, 16 (07): 1385-1396
  • [49] Learning double weights via data augmentation for robust sparse and collaborative representation-based classification
    Zeng, Shaoning
    Zhang, Bob
    Gou, Jianping
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (29-30): 20617-20638
  • [50] Dictionary learning via locality preserving for sparse representation
    School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China
    HUANAN LIGONG DAXUE XUEBAO (JOURNAL OF SOUTH CHINA UNIVERSITY OF TECHNOLOGY), (1): 142-146