MulA-nnUNet: A Multi-Attention Enhanced nnUNet Framework for 3D Abdominal Multi-Organs Segmentation

Authors
Ding, Jiashuo [1]
Ni, Wei [1]
Wan, Jiahui [2]
Deng, Xiaojun [1]
Wan, Lanjun [1]
Affiliations
[1] Hunan Univ Technol, Sch Comp Sci, Zhuzhou 412007, Peoples R China
[2] Hunan Agr Univ, Coll Mech & Elect Engn, Changsha 410125, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Image segmentation; Three-dimensional displays; Semantics; Decoding; Accuracy; Attention mechanisms; Tumors; Abdominal multi-organ image segmentation; attention mechanism; deep learning; nnUNet
DOI
10.1109/ACCESS.2024.3437652
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In medical image segmentation, the nnUNet framework is highly regarded for its strong performance and wide applicability. However, the locality and weight-sharing biases inherent in its successive convolutional operations limit the network's ability to model long-range dependencies. Furthermore, the residual links between encoder and decoder are limited by the substantial semantic discrepancy between the encoder's and the decoder's output feature maps: applying skip connections directly for feature fusion and gradient propagation affects the model's convergence speed and overall performance. This paper presents a novel framework, Multi-Attention nnUNet (MulA-nnUNet), which uses nnUNet as the foundational network structure and integrates two key attention mechanisms: large kernel convolutional attention (LKA) and pixel attention (PA). LKA is embedded in the deep encoder layers, which preserves the effectiveness of shallow feature extraction while enhancing the network's ability to capture long-range spatial dependencies. The PA module, in turn, reduces the semantic discrepancy between the output feature maps of the encoder and the decoder, which improves feature fusion across the skip connections. Model complexity is reduced by replacing the standard convolutions in the encoder and decoder layers with depthwise separable convolutions (DS), which have fewer parameters. The effectiveness of the proposed framework is confirmed by a set of ablation experiments and by comparison experiments with current state-of-the-art models on the computed tomography (CT) subset of the multimodal abdominal multi-organ segmentation dataset (AMOS), which includes 500 CT scans: 350 for training, 75 for validation, and 75 for testing.
MulA-nnUNet improves the mean Dice similarity coefficient (mDSC) by 1.1% and the mean intersection over union (mIoU) by 1.52% over the baseline model, even though the baseline requires 5 times the floating point operations (FLOPs) and over 7 times the parameters (Params). It also segments organs such as the liver, stomach, aorta, and pancreas more accurately, thereby improving the accuracy of 3D abdominal multi-organ image segmentation.
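The two metrics named in the abstract, mDSC and mIoU, can be illustrated with a minimal sketch. It assumes integer label volumes and a simple per-label average; the function names `dice_and_iou` and `mean_scores` are hypothetical, and the paper's actual evaluation protocol (per-case averaging, handling of organs absent from a scan) is not stated in the abstract and may differ.

```python
import numpy as np


def dice_and_iou(pred, target, label):
    """Return (Dice, IoU) for one organ label in two integer label volumes."""
    p = pred == label
    t = target == label
    intersection = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    total = p.sum() + t.sum()
    if total == 0:
        # Assumption: a label absent from both volumes is skipped (NaN).
        return float("nan"), float("nan")
    dice = 2.0 * intersection / total  # 2|P∩T| / (|P| + |T|)
    iou = intersection / union         # |P∩T| / |P∪T|
    return float(dice), float(iou)


def mean_scores(pred, target, labels):
    """Average Dice and IoU over the given labels, ignoring NaN entries."""
    scores = np.array([dice_and_iou(pred, target, lab) for lab in labels])
    mdsc, miou = np.nanmean(scores, axis=0)
    return float(mdsc), float(miou)


# Toy 1x2x4 volume with two foreground labels (0 is background).
pred = np.array([[[1, 1, 1, 0], [2, 2, 0, 0]]])
target = np.array([[[1, 1, 0, 0], [2, 2, 2, 0]]])
mdsc, miou = mean_scores(pred, target, labels=[1, 2])
print(f"mDSC={mdsc:.4f}, mIoU={miou:.4f}")
```

Because Dice counts the intersection twice, it is never lower than IoU for the same prediction, which is why the two reported improvements differ in size.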
Pages: 106658-106671
Page count: 14