Streamlined photoacoustic image processing with foundation models: A training-free solution

Authors
Deng, Handi [1, 2, 3]
Zhou, Yucheng [4]
Xiang, Jiaxuan [5]
Gu, Liujie [1, 2, 3]
Luo, Yan [1]
Feng, Hai [6]
Liu, Mingyuan [6]
Ma, Cheng [1, 2, 3]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Dept Elect Engn, 30 Shuangqing Rd, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Inst Precis Healthcare, 77 Shuangqing Rd, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Inst Intelligent Healthcare, 77 Shuangqing Rd, Beijing 100084, Peoples R China
[4] Beihang Univ, Sch Biol Sci & Med Engn, 37 XueYuan Rd, Beijing 100191, Peoples R China
[5] TsingPAI Technol Co Ltd, 27 Jiancaicheng Middle Rd, Beijing 100096, Peoples R China
[6] Capital Med Univ, Beijing Friendship Hosp, Dept Vasc Surg, 95 Yongan Rd, Beijing 100050, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Foundation models; photoacoustic imaging; image segmentation; large model; SEGMENTATION; TOMOGRAPHY; CANCER
DOI
10.1142/S1793545824500196
Chinese Library Classification (CLC) number
O43 [Optics]
Subject classification code
070207; 0803
Abstract
Foundation models (FMs) have evolved rapidly and achieved significant success in computer vision tasks. In particular, the prompt mechanism lets users conveniently integrate prior information about an image into the model, making it possible to apply models without any training. We therefore propose a training-free workflow based on foundation models for photoacoustic (PA) image processing. We employ the Segment Anything Model (SAM), setting simple prompts and integrating the model's outputs with prior knowledge of the imaged objects, to accomplish three tasks: (1) removing the skin signal in three-dimensional PA image rendering; (2) dual speed-of-sound reconstruction; and (3) segmentation of finger blood vessels. Through these demonstrations, we conclude that FMs can be applied directly in PA imaging without network design or training. This potentially allows for a hands-on, convenient approach to efficient and accurate segmentation of PA images. The paper also serves as a tutorial: code and sample datasets are provided to help readers master the technique.
Pages: 11