PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation

Cited by: 0

Authors:

Saadi, Nada [1]

Saeed, Numan [1]

Yaqub, Mohammad [1]

Nandakumar, Karthik [1]

Affiliations:

[1] Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT XII | 2024, Vol. 15012

Keywords:

Multi-modal Adaptation; Low-rank Adaptation; Parameter-Efficiency; Cross-modal Entanglement; 3D Medical Image Segmentation

DOI:

10.1007/978-3-031-72390-2_25

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Imaging modalities such as Computed Tomography (CT) and Positron Emission Tomography (PET) are key in cancer detection, inspiring Deep Neural Network (DNN) models that merge these scans for tumor segmentation. When both CT and PET scans are available, it is common to combine them as two channels of the input to the segmentation model. However, this approach requires both scan types during training and inference, which is a challenge because PET scans have limited availability, so the process is sometimes restricted to CT scans only. Hence, there is a need for a flexible DNN architecture that can be trained or updated using only CT scans but can effectively utilize PET scans when they become available. In this work, we propose a parameter-efficient multi-modal adaptation (PEMMA) framework for lightweight upgrading of a transformer-based segmentation model trained only on CT scans so that it also incorporates PET scans. The benefits of the proposed approach are two-fold. First, we leverage the inherent modularity of the transformer architecture and perform low-rank adaptation (LoRA) of the attention weights to achieve parameter-efficient adaptation. Second, since the PEMMA framework attempts to minimize cross-modal entanglement, it is possible to subsequently update the combined model using only one modality without causing catastrophic forgetting of the other modality. Our proposed method achieves results comparable to those of early fusion techniques with just 8% of the trainable parameters, and notably a +28% improvement in the average Dice score on PET scans when trained on a single modality.
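The abstract names low-rank adaptation (LoRA) of attention weights as the source of parameter efficiency. The following is a minimal, generic sketch of that idea, not the authors' implementation: a frozen weight matrix is left untouched and a trainable low-rank product is added alongside it. All names (`lora_forward`, `lora_param_fraction`), the toy layer sizes, the rank, and the scaling factor `alpha` are assumptions chosen for illustration, and the parameter fraction computed here is unrelated to the 8% figure reported in the abstract.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w_frozen, lora_a, lora_b, alpha, rank):
    """Compute x @ W + (alpha / rank) * (x @ A) @ B.

    W stays frozen; only A (d_in x rank) and B (rank x d_out) would be trained.
    """
    base = matmul(x, w_frozen)                  # frozen pretrained path
    update = matmul(matmul(x, lora_a), lora_b)  # low-rank trainable path
    scale = alpha / rank
    return [[b + scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, update)]


def lora_param_fraction(d_in, d_out, rank):
    """Trainable LoRA parameters as a fraction of the full weight matrix."""
    return rank * (d_in + d_out) / (d_in * d_out)


# Hypothetical toy sizes, chosen only to keep the example small.
d_in, d_out, rank = 8, 8, 2
rng = random.Random(0)
w_frozen = [[rng.gauss(0, 1) for _ in range(d_out)] for _ in range(d_in)]
lora_a = [[rng.gauss(0, 0.01) for _ in range(rank)] for _ in range(d_in)]
lora_b = [[0.0] * d_out for _ in range(rank)]  # zero init: adapter starts as a no-op
x = [[rng.gauss(0, 1) for _ in range(d_in)]]

out = lora_forward(x, w_frozen, lora_a, lora_b, alpha=4.0, rank=rank)
base_out = matmul(x, w_frozen)
```

Because `lora_b` starts at zero, the adapted layer initially reproduces the frozen layer's output exactly, which is the usual reason for that initialization: adaptation begins from the pretrained behavior and departs from it only as the low-rank factors are trained.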


Pages: 262 - 271

Page count: 10

50 records in total

[41] Trans-SAM: Transfer Segment Anything Model to medical image segmentation with Parameter-Efficient Fine-Tuning
Wu, Yanlin
Wang, Zhihong
Yang, Xiongfeng
Kang, Hong
He, Along
Li, Tao
KNOWLEDGE-BASED SYSTEMS, 2025, 310
[42] Multi-layer, multi-modal medical image intelligent fusion
Nair, Rekha R.
Singh, Tripty
Basavapattana, Abhinandan
Pawar, Manasa M.
MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (29) : 42821 - 42847
[43] Multi-layer, multi-modal medical image intelligent fusion
Rekha R. Nair
Tripty Singh
Abhinandan Basavapattana
Manasa M. Pawar
Multimedia Tools and Applications, 2022, 81 : 42821 - 42847
[44] Learning Cross-Modal Deep Representations for Multi-Modal MR Image Segmentation
Li, Cheng
Sun, Hui
Liu, Zaiyi
Wang, Meiyun
Zheng, Hairong
Wang, Shanshan
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT II, 2019, 11765 : 57 - 65
[45] Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
Zhang, Menghui
Zhang, Yuchen
Liu, Shuaibing
Han, Yahui
Cao, Honggang
Qiao, Bingbing
SCIENTIFIC REPORTS, 2024, 14 (01)
[46] Adversarial Cross-modal Domain Adaptation for Multi-modal Semantic Segmentation in Autonomous Driving
Shi, Mengqi
Cao, Haozhi
Xie, Lihua
Yang, Jianfei
2022 17TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2022 : 850 - 855
[47] Adaptive decomposition method for multi-modal medical image fusion
Wang, Jing
Li, Xiongfei
Zhang, Yan
Zhang, Xiaoli
IET IMAGE PROCESSING, 2018, 12 (08) : 1403 - 1412
[48] Multi-modal Medical Image Registration by Local Affine Transformations
Lo Presti, Liliana
La Cascia, Marco
PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM 2018), 2018 : 534 - 540
[49] Performance comparisons of multi-modal medical image registration algorithms
Chihoub, A
Bansal, R
Bani-Hashemi, A
2002 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS, 2002 : 125 - 128
[50] Multi-modal medical image fusion using LMF-GAN - A maximum parameter infusion technique
Nair, Rekha R.
Singh, Tripty
Sankar, Rashmi
Gunndu, Klement
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (05) : 5375 - 5386
