PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation

Cited by: 0
Authors
Saadi, Nada [1]
Saeed, Numan [1]
Yaqub, Mohammad [1]
Nandakumar, Karthik [1]
Affiliations
[1] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Keywords
Multi-modal Adaptation; Low-rank Adaptation; Parameter-Efficiency; Cross-modal Entanglement; 3D Medical Image Segmentation
DOI
10.1007/978-3-031-72390-2_25
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Imaging modalities such as Computed Tomography (CT) and Positron Emission Tomography (PET) are key in cancer detection, inspiring Deep Neural Network (DNN) models that merge these scans for tumor segmentation. When both CT and PET scans are available, it is common to combine them as two channels of the input to the segmentation model. However, this method requires both scan types during training and inference, which is a challenge because PET scans have limited availability; in practice, the process is sometimes restricted to CT scans only. Hence, there is a need for a flexible DNN architecture that can be trained or updated using only CT scans but can effectively utilize PET scans when they become available. In this work, we propose a parameter-efficient multi-modal adaptation (PEMMA) framework for lightweight upgrading of a transformer-based segmentation model trained only on CT scans so that it can also incorporate PET scans. The benefits of the proposed approach are two-fold. Firstly, we leverage the inherent modularity of the transformer architecture and perform low-rank adaptation (LoRA) of the attention weights to achieve parameter-efficient adaptation. Secondly, since the PEMMA framework attempts to minimize cross-modal entanglement, it is possible to subsequently update the combined model using only one modality, without causing catastrophic forgetting of the other modality. Our proposed method achieves performance comparable to that of early fusion techniques with just 8% of the trainable parameters, and notably a +28% improvement in the average Dice score on PET scans when trained on a single modality.
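The low-rank adaptation (LoRA) idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the hidden size `d_model`, the rank `rank`, the helper `lora_project`, and the random stand-in weights are all hypothetical choices made for illustration. The sketch only shows the general mechanism of keeping a pretrained attention projection frozen while training a small low-rank update, and how few parameters that update adds relative to the full matrix. The ratio it prints depends on the assumed sizes and is not meant to reproduce the 8% figure reported in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 768  # hypothetical hidden size of one attention projection
rank = 8       # hypothetical LoRA rank

# Frozen projection weights: a random stand-in for weights pretrained on CT only.
W_frozen = rng.standard_normal((d_model, d_model)) * 0.02

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially reproduces the frozen layer exactly.
A = rng.standard_normal((rank, d_model)) * 0.01
B = np.zeros((d_model, rank))


def lora_project(x, W, A, B, scale=1.0):
    """Compute x W^T + scale * x (B A)^T without forming the full update B A."""
    return x @ W.T + scale * (x @ A.T) @ B.T


tokens = rng.standard_normal((4, d_model))  # four hypothetical token embeddings
out = lora_project(tokens, W_frozen, A, B)

full_params = W_frozen.size
lora_params = A.size + B.size
print("output shape:", out.shape)
print("trainable fraction for this layer:", lora_params / full_params)
```

Because only `A` and `B` would receive gradient updates in such a setup, the original projection stays untouched, which is the property that makes this kind of adaptation lightweight.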
Pages: 262-271
Page count: 10