MulA-nnUNet: A Multi-Attention Enhanced nnUNet Framework for 3D Abdominal Multi-Organs Segmentation

Citations: 0

Authors:

Ding, Jiashuo [1]

Ni, Wei [1]

Wan, Jiahui [2]

Deng, Xiaojun [1]

Wan, Lanjun [1]

Affiliations:

[1] Hunan Univ Technol, Sch Comp Sci, Zhuzhou 412007, Peoples R China

[2] Hunan Agr Univ, Coll Mech & Elect Engn, Changsha 410125, Peoples R China

Source:

IEEE ACCESS | 2024 / Volume 12

Keywords:

Image segmentation; Three-dimensional displays; Semantics; Decoding; Accuracy; Attention mechanisms; Tumors; Abdominal multi-organ image segmentation; attention mechanism; deep learning; nnUNet;

DOI:

10.1109/ACCESS.2024.3437652

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In medical image segmentation, the nnUNet framework is widely used for its strong performance across a broad range of applications. However, the locality and weight-sharing biases inherent in its stacked convolutional operations limit the network's ability to model long-range dependencies. Furthermore, the substantial semantic gap between the encoder's output feature maps and the decoder's limits the direct use of skip connections for feature fusion and gradient propagation, which affects the model's convergence speed and overall performance. This paper presents Multi-Attention nnUNet (MulA-nnUNet), a framework that uses nnUNet as its backbone and integrates two attention mechanisms: large kernel convolutional attention (LKA) and pixel attention (PA). LKA is embedded in the deep encoder layers, preserving the effectiveness of shallow feature extraction while enhancing the network's ability to capture long-range spatial dependencies. The PA module reduces the semantic gap between the encoder and decoder feature maps, which improves feature fusion across skip connections. Model complexity is reduced by replacing the standard convolutions in the encoder and decoder layers with depthwise separable (DS) convolutions, which have fewer parameters. The framework is evaluated through ablation experiments and comparisons with current state-of-the-art models on the computed tomography (CT) subset of the multimodal abdominal multi-organ segmentation dataset (AMOS), which includes 500 CT scans: 350 for training, 75 for validation, and 75 for testing. Relative to the baseline, MulA-nnUNet improves the mean Dice similarity coefficient (mDSC) by 1.1% and the mean intersection over union (mIoU) by 1.52%, while the baseline model requires 5 times the floating point operations (FLOPs) and over 7 times the parameters (Params). It also segments organs such as the liver, stomach, aorta, and pancreas more accurately, improving the accuracy of 3D abdominal multi-organ image segmentation.
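For readers unfamiliar with the two reported metrics, the following is a minimal Python sketch of how a per-organ Dice similarity coefficient and intersection over union can be computed and averaged over labels. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `dice_and_iou` and `mean_dice_and_iou` are hypothetical, volumes are assumed to be flattened into equal-length sequences of integer labels, and the abstract does not specify how scores are averaged across organs or scans.

```python
def dice_and_iou(pred, target, label):
    """Return (DSC, IoU) for one label over two flat, equal-length label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of voxels")
    p = [v == label for v in pred]      # predicted mask for this label
    t = [v == label for v in target]    # ground-truth mask for this label
    inter = sum(a and b for a, b in zip(p, t))
    p_sum, t_sum = sum(p), sum(t)
    union = p_sum + t_sum - inter
    if union == 0:
        # Assumed convention: a label absent from both masks counts as a perfect match.
        return 1.0, 1.0
    dsc = 2 * inter / (p_sum + t_sum)   # DSC = 2|A ∩ B| / (|A| + |B|)
    iou = inter / union                 # IoU = |A ∩ B| / |A ∪ B|
    return dsc, iou


def mean_dice_and_iou(pred, target, labels):
    """Average DSC and IoU over the given foreground labels (assumed averaging scheme)."""
    scores = [dice_and_iou(pred, target, label) for label in labels]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


# Toy example with six voxels and two foreground labels (1 and 2); 0 is background.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 0]
m_dsc, m_iou = mean_dice_and_iou(pred, target, labels=[1, 2])
```

In this toy case each label has a DSC of 2/3 and an IoU of 1/2, so the means are 2/3 and 1/2. The two metrics are monotonically related for a single mask pair (IoU = DSC / (2 - DSC)), which is why improvements in mDSC and mIoU tend to move together.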

Citation

Pages: 106658-106671

Number of pages: 14
