An Attention-Based Architecture for Hierarchical Classification With CNNs

Cited by: 5
Authors
Pizarro, Ivan [1 ]
Nanculef, Ricardo [1 ]
Valle, Carlos [2 ]
Affiliations
[1] Univ Tecn Federico Santa Maria, Dept Informat, Valparaiso 2390123, Chile
[2] Univ Playa Ancha, Dept Data Sci & Informat, Valparaiso 2360072, Chile
Keywords
Taxonomy; Measurement; Computer architecture; Training; Convolutional neural networks; Classification algorithms; Predictive models; Attention mechanisms; Deep learning; Hierarchical classification
DOI
10.1109/ACCESS.2023.3263472
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Branch Convolutional Neural Nets have become a popular approach for hierarchical classification in computer vision and other areas. Unfortunately, these models often lead to hierarchical inconsistency: predictions for the different hierarchy levels do not necessarily respect the class-subclass constraints imposed by the hierarchy. Several architectures that connect the branches have arisen to overcome this limitation. In this paper, we propose a more straightforward and flexible method: let the neural net decide how these branches must be connected. We achieve this by formulating an attention mechanism that dynamically determines how branches influence each other during training and inference. Experiments on image classification benchmarks show that the proposed method can outperform state-of-the-art models in terms of hierarchical performance metrics and consistency. Furthermore, although we sometimes found slightly lower performance at the deepest level of the hierarchy, the model predicts the ground-truth path between a concept and its ancestors in the hierarchy much more accurately. This result suggests that the model learns not only local class memberships but also hierarchical dependencies between concepts.
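The core idea of the abstract, letting learned attention weights decide how much each branch's output influences the others, can be illustrated with a minimal numerical sketch. This is not the authors' architecture; the two-level hierarchy, the `child_to_parent` mapping, and the `attention_combine` function are hypothetical stand-ins, assuming a coarse branch and a fine branch whose logits are blended through a learned attention score:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical two-level hierarchy: 2 coarse classes, each with 2 fine
# subclasses. child_to_parent[j] is the coarse parent of fine class j.
child_to_parent = np.array([0, 0, 1, 1])

def attention_combine(coarse_logits, fine_logits, attn_scores):
    """Blend the fine branch's logits with the coarse branch's evidence.

    attn_scores (shape [2]) stand in for learned attention deciding how
    strongly the coarse branch influences the fine branch.
    """
    a = softmax(attn_scores)                     # normalize the two scores
    lifted = coarse_logits[..., child_to_parent]  # lift coarse logits to fine level
    return a[0] * fine_logits + a[1] * lifted

coarse = np.array([2.0, -1.0])           # coarse branch favors class 0
fine = np.array([0.1, 0.2, 0.3, 0.0])    # fine branch is nearly uniform
mixed = attention_combine(coarse, fine, np.array([0.0, 1.0]))
probs = softmax(mixed)
# Coarse evidence pulls probability mass toward the subclasses of coarse
# class 0, keeping the two levels hierarchically consistent.
print(probs[:2].sum() > probs[2:].sum())  # → True
```

In the paper's setting the attention scores are produced by the network itself rather than fixed, so the degree of inter-branch influence is learned during training; the sketch only shows why such a weighted connection discourages predictions that violate the parent-child constraints.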
Pages: 32972 - 32995
Page count: 24
Related Papers
50 records total
  • [1] An Attention-based Architecture for EEG Classification
    Zoppis, Italo
    Zanga, Alessio
    Manzoni, Sara
    Cisotto, Giulia
    Morreale, Angela
    Stella, Fabio
    Mauri, Giancarlo
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, VOL 4: BIOSIGNALS, 2020, : 214 - 219
  • [2] Attention-based Hierarchical LSTM Model for Document Sentiment Classification
    Wang, Bo
    Fan, Binwen
    2018 2ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE APPLICATIONS AND TECHNOLOGIES (AIAAT 2018), 2018, 435
  • [3] Attention-Based Hierarchical Recurrent Neural Network for Phenotype Classification
    Xu, Nan
    Shen, Yanyan
    Zhu, Yanmin
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT I, 2019, 11439 : 465 - 476
  • [4] Poster Abstract: Attention-based LSTM-CNNs For Time-series Classification
    Du, Qianjin
    Gu, Weixi
    Zhang, Lin
    Huang, Shao-Lun
    SENSYS'18: PROCEEDINGS OF THE 16TH CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, 2018, : 410 - 411
  • [5] Hierarchical Attention-based Fully Convolutional Network for Satellite Cloud Classification and Detection
    Jin, Dan
    Li, Mingqiang
    2023 9TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS, ICCAR, 2023, : 337 - 341
  • [6] Designing self attention-based ResNet architecture for rice leaf disease classification
    Ancy Stephen
    A. Punitha
    A. Chandrasekar
    Neural Computing and Applications, 2023, 35 : 6737 - 6751
  • [7] Designing self attention-based ResNet architecture for rice leaf disease classification
    Stephen, Ancy
    Punitha, A.
    Chandrasekar, A.
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (09): 6737 - 6751
  • [8] Attention-Based DenseNet for Pneumonia Classification
    Wang, K.
    Jiang, P.
    Meng, J.
    Jiang, X.
    IRBM, 2022, 43 (05) : 479 - 485
  • [9] An Attention-Based CNN for ECG Classification
    Kuvaev, Alexander
    Khudorozhkov, Roman
    ADVANCES IN COMPUTER VISION, CVC, VOL 1, 2020, 943 : 671 - 677
  • [10] A Hierarchical Neural Attention-based Text Classifier
    Sinha, Koustuv
    Dong, Yue
    Cheung, Jackie C. K.
    Ruths, Derek
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 817 - 823