An Attention-Based Architecture for Hierarchical Classification With CNNs

Cited by: 5
Authors
Pizarro, Ivan [1 ]
Nanculef, Ricardo [1 ]
Valle, Carlos [2 ]
Affiliations
[1] Univ Tecn Federico Santa Maria, Dept Informat, Valparaiso 2390123, Chile
[2] Univ Playa Ancha, Dept Data Sci & Informat, Valparaiso 2360072, Chile
Keywords
Taxonomy; Measurement; Computer architecture; Training; Convolutional neural networks; Classification algorithms; Predictive models; Attention mechanisms; Deep learning; Hierarchical classification
DOI
10.1109/ACCESS.2023.3263472
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Branch Convolutional Neural Nets have become a popular approach for hierarchical classification in computer vision and other areas. Unfortunately, these models often lead to hierarchical inconsistency: predictions for the different hierarchy levels do not necessarily respect the class-subclass constraints imposed by the hierarchy. Several architectures that connect the branches have been proposed to overcome this limitation. In this paper, we propose a more straightforward and flexible method: let the neural net decide how the branches should be connected. We achieve this by formulating an attention mechanism that dynamically determines how branches influence each other during training and inference. Experiments on image classification benchmarks show that the proposed method can outperform state-of-the-art models in terms of hierarchical performance metrics and consistency. Furthermore, although we sometimes observed slightly lower performance at the deepest level of the hierarchy, the model predicts the ground-truth path between a concept and its ancestors much more accurately. This result suggests that the model learns not only local class memberships but also hierarchical dependencies between concepts.
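To make the core idea concrete, the sketch below illustrates how per-level branch heads could exchange information through a learned attention layer in PyTorch. This is not the authors' released code: the class name AttentiveBranchHead, the per-level projections, the single-head attention, and the toy level sizes are assumptions made only to illustrate "letting the network decide how branches are connected".

```python
import torch
import torch.nn as nn


class AttentiveBranchHead(nn.Module):
    """Minimal sketch (hypothetical, not the paper's architecture):
    one classification branch per hierarchy level, with an attention layer
    that lets every level weigh the other levels' features before predicting."""

    def __init__(self, feat_dim, level_sizes):
        super().__init__()
        # hypothetical per-level projections of the shared CNN backbone features
        self.proj = nn.ModuleList([nn.Linear(feat_dim, feat_dim) for _ in level_sizes])
        # attention over the set of level features; branch couplings are learned, not fixed
        self.attn = nn.MultiheadAttention(embed_dim=feat_dim, num_heads=1, batch_first=True)
        # one classifier (branch) per hierarchy level
        self.branches = nn.ModuleList([nn.Linear(feat_dim, n) for n in level_sizes])

    def forward(self, x):
        # x: (batch, feat_dim) features from any CNN backbone
        views = torch.stack([p(x) for p in self.proj], dim=1)   # (batch, levels, feat_dim)
        ctx, couplings = self.attn(views, views, views)         # each level attends to all levels
        logits = [head(ctx[:, i]) for i, head in enumerate(self.branches)]
        return logits, couplings                                 # per-level logits + attention weights


# Toy usage with made-up sizes (e.g. a 3-level taxonomy with 2 / 7 / 20 classes).
head = AttentiveBranchHead(feat_dim=512, level_sizes=[2, 7, 20])
logits, couplings = head(torch.randn(8, 512))
print([tuple(l.shape) for l in logits], tuple(couplings.shape))  # [(8, 2), (8, 7), (8, 20)] (8, 3, 3)
```

Because the attention weights are produced per example, the coupling between levels can differ from image to image, which is one plausible way branch predictions could be kept consistent with the class-subclass constraints described in the abstract.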
Pages: 32972-32995
Page count: 24
Related Papers
50 records in total
  • [41] HiRXN: Hierarchical Attention-Based Representation Learning for Chemical Reaction
    Cao, Yahui
    Zhang, Tao
    Zhao, Xin
    Li, Haotong
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2025, 65 (04) : 1990 - 2002
  • [42] A Deep Hybrid Pooling Architecture for Graph Classification with Hierarchical Attention
    Bandyopadhyay, Sambaran
    Aggarwal, Manasvi
    Murty, M. Narasimha
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT I, 2021, 12712 : 554 - 565
  • [43] TemporalHAN: Hierarchical attention-based heterogeneous temporal network embedding
    Mo, Xian
    Wan, Binyuan
    Tang, Rui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [44] HAIF: A Hierarchical Attention-Based Model of Filtering Invalid Webpage
    Zhou, Chaoran
    Zhao, Jianping
    Ma, Tai
    Zhou, Xin
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (05) : 659 - 668
  • [45] An Evolutionary Attention-Based Network for Medical Image Classification
    Zhu, Hengde
    Wang, Jian
    Wang, Shui-Hua
    Raman, Rajeev
    Gorriz, Juan M.
    Zhang, Yu-Dong
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2023, 33 (03)
  • [46] Attention-Based Memory Network for Text Sentiment Classification
    Han, Hu
    Liu, Jin
    Liu, Guoli
    IEEE ACCESS, 2018, 6 : 68302 - 68310
  • [47] A Hierarchical Multimodal Attention-based Neural Network for Image Captioning
    Cheng, Yong
    Huang, Fei
    Zhou, Lian
    Jin, Cheng
    Zhang, Yuejie
    Zhang, Tao
    SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017, : 889 - 892
  • [48] A visual attention-based keyword extraction for document classification
    Wu, Xing
    Du, Zhikang
    Guo, Yike
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (19) : 25355 - 25367
  • [49] NASABN: A Neural Architecture Search Framework for Attention-Based Networks
    Jing, Kun
    Xu, Jungang
    Xu, Hui
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [50] Transformer-Based Fused Attention Combined with CNNs for Image Classification
    Jiang, Jielin
    Xu, Hongxiang
    Xu, Xiaolong
    Cui, Yan
    Wu, Jintao
    NEURAL PROCESSING LETTERS, 2023, 55 : 11905 - 11919