Streamlined photoacoustic image processing with foundation models: A training-free solution

Authors
Deng, Handi [1, 2, 3]
Zhou, Yucheng [4]
Xiang, Jiaxuan [5]
Gu, Liujie [1, 2, 3]
Luo, Yan [1]
Feng, Hai [6]
Liu, Mingyuan [6]
Ma, Cheng [1, 2, 3]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Dept Elect Engn, 30 Shuangqing Rd, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Inst Precis Healthcare, 77 Shuangqing Rd, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Inst Intelligent Healthcare, 77 Shuangqing Rd, Beijing 100084, Peoples R China
[4] Beihang Univ, Sch Biol Sci & Med Engn, 37 XueYuan Rd, Beijing 100191, Peoples R China
[5] TsingPAI Technol Co Ltd, 27 Jiancaicheng Middle Rd, Beijing 100096, Peoples R China
[6] Capital Med Univ, Beijing Friendship Hosp, Dept Vasc Surg, 95 Yongan Rd, Beijing 100050, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Foundation models; photoacoustic imaging; image segmentation; large model; SEGMENTATION; TOMOGRAPHY; CANCER;
DOI
10.1142/S1793545824500196
Chinese Library Classification number
O43 [Optics];
Discipline classification code
070207; 0803;
Abstract
Foundation models (FMs) have evolved rapidly and achieved notable results in computer vision tasks. In particular, the prompt mechanism lets users inject prior information about an image into the model, making it possible to apply a model without any training. We therefore propose a training-free workflow based on foundation models for photoacoustic (PA) image processing. We employ the Segment Anything Model (SAM) with simple prompts and combine the model's outputs with prior knowledge of the imaged objects to accomplish several tasks: (1) removing the skin signal in three-dimensional PA image rendering; (2) dual speed-of-sound reconstruction; and (3) segmentation of finger blood vessels. These demonstrations show that FMs can be applied directly to PA imaging without network design or training, potentially offering a convenient, hands-on route to efficient and accurate segmentation of PA images. This paper also serves as a tutorial, with code and sample datasets provided to help readers master the technique.
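As an illustration of how a segmentation mask might be combined with prior knowledge for task (1), the sketch below suppresses everything at or above a skin mask before a maximum intensity projection. It is not the authors' code, and the abstract does not specify their procedure. The function names, the `volume[z][y][x]` layout with depth index 0 nearest the surface, the `margin` parameter, and the assumption that a per-voxel skin mask has already been obtained from a prompted segmentation model are all hypothetical. Plain nested lists are used only to keep the example dependency-free.

```python
def remove_skin(volume, skin_mask, margin=0):
    """Return a copy of `volume` with each (y, x) column zeroed from the
    surface down to the deepest skin voxel (plus `margin` extra voxels).

    Assumption: volume[z][y][x] and skin_mask[z][y][x] share one shape,
    and z = 0 is the voxel closest to the skin surface.
    """
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    # Copy so the caller's data is left untouched.
    cleaned = [[row[:] for row in plane] for plane in volume]
    for y in range(height):
        for x in range(width):
            skin_depths = [z for z in range(depth) if skin_mask[z][y][x]]
            if not skin_depths:
                continue  # no skin found in this column; leave it unchanged
            cutoff = min(max(skin_depths) + margin, depth - 1)
            for z in range(cutoff + 1):
                cleaned[z][y][x] = 0.0
    return cleaned


def max_intensity_projection(volume):
    """Project along depth, keeping the brightest voxel per (y, x) column."""
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [max(plane[y][x] for plane in volume) for x in range(width)]
        for y in range(height)
    ]


# Toy volume: 4 depth samples, 1 row, 2 columns.
# Bright skin sits at the top; dimmer vessels lie underneath.
volume = [
    [[9.0, 9.0]],  # z = 0: skin in both columns
    [[0.0, 8.0]],  # z = 1: skin continues in column 1
    [[5.0, 0.0]],  # z = 2: vessel in column 0
    [[0.0, 4.0]],  # z = 3: vessel in column 1
]
skin_mask = [
    [[True, True]],
    [[False, True]],
    [[False, False]],
    [[False, False]],
]

mip_with_skin = max_intensity_projection(volume)
mip_without_skin = max_intensity_projection(remove_skin(volume, skin_mask))
```

In this toy case the projection is dominated by the skin before masking, and the deeper vessel values become visible afterwards. A real pipeline would likely operate on arrays and obtain `skin_mask` slice by slice from the segmentation model.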
Pages: 11