Self-supervised learning of monocular depth and ego-motion estimation for non-rigid scenes in wireless capsule endoscopy videos

Times Cited: 0
Authors
Liao, Chao [1 ,2 ]
Wang, Chengliang [2 ]
Wang, Peng [2 ]
Wu, Hao [2 ]
Wang, Hongqian [2 ]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
[2] Army Med Univ (Third Mil Med Univ), Southwest Hosp, Chongqing, Peoples R China
Keywords
Wireless capsule endoscopy (WCE) images; Monocular depth estimation; Ego-motion estimation; Non-rigid scenes; Transformer; CANCER;
DOI
10.1016/j.bspc.2024.105978
CLC Number
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Background and objective: Gastrointestinal (GI) cancers are the most widespread type of cancer worldwide. Wireless capsule endoscopy (WCE), an innovative, capsule-sized endoscope, has the potential to transform the diagnosis and treatment of GI cancers and other GI diseases by offering patients a less invasive and more comfortable option. Nonetheless, WCE videos frequently exhibit non-rigid deformations and brightness fluctuations, rendering prior simultaneous localization and mapping (SLAM) approaches infeasible. Depth can assist in recognizing and monitoring potential obstructions or anomalies when localization is required. Methods: In this paper, we present a self-supervised model, SfMLearner-WCE, designed to estimate depth and ego motion in WCE videos. Our approach combines a pose estimation network with a Transformer network equipped with a global self-attention mechanism. To ensure high-quality depth and pose estimation, we propose learnable binary per-pixel masks that exclude misaligned image regions caused by non-rigid deformations or significant lighting changes. We further introduce multi-interval frame sampling to increase training-data diversity, coupled with long-term pose consistency regularization. Results: We comprehensively evaluate SfMLearner-WCE against five state-of-the-art self-supervised SLAM methods on three WCE datasets. The experimental results demonstrate that our approach achieves high-quality depth estimation and high-precision ego-motion estimation for non-rigid scenes in WCE videos, outperforming the other self-supervised SLAM methods. In the quantitative evaluation of depth estimation on the ColonDepth dataset, an absolute relative error of 0.232 was observed. In the quantitative assessment of ego-motion estimation on the ColonSim dataset, a translation drift of 43.176% was achieved at 2 frames per second. Conclusions: The experimental analysis in this study demonstrates the effectiveness and robustness of the proposed SfMLearner-WCE in non-rigid scenes of WCE videos. SfMLearner-WCE helps improve diagnostic efficiency, enabling physicians to navigate and analyze WCE videos more effectively, to the benefit of patient outcomes. Our code will be released at https://github.com/fisherliaoc/SfMLearner-WCE.
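The masking idea described in the Methods can be illustrated in isolation: a per-pixel mask, produced by a learnable map, down-weights the photometric reprojection error at pixels that cannot be aligned (non-rigid deformation, lighting change), while a regularizer pushes the mask toward one so the trivial all-zero mask is penalized. The following is a minimal numpy sketch of that objective only; the function name, the sigmoid parameterization, and the cross-entropy-style regularizer are illustrative assumptions, not the paper's exact formulation, and the view warping and networks are not reproduced here.

```python
import numpy as np

def masked_photometric_loss(target, warped, mask_logits, reg_weight=0.2):
    """Photometric L1 loss weighted by a learnable per-pixel mask.

    The mask (sigmoid of logits) suppresses pixels where the warped
    source view cannot match the target, e.g. due to non-rigid motion.
    A regularizer favors mask values near 1, so discarding pixels has
    a cost and the all-zero mask is not a trivial minimum.
    """
    mask = 1.0 / (1.0 + np.exp(-mask_logits))   # sigmoid -> (0, 1)
    photometric = np.abs(target - warped)       # per-pixel reprojection error
    data_term = np.mean(mask * photometric)     # masked photometric loss
    reg_term = -np.mean(np.log(mask + 1e-8))    # cross-entropy vs. all-ones mask
    return data_term + reg_weight * reg_term

# Toy frames: mostly aligned, with one badly misaligned band of rows.
rng = np.random.default_rng(0)
target = rng.random((8, 8))
warped = target + 0.05 * rng.standard_normal((8, 8))
warped[:2, :] += 1.0                  # simulate a non-rigidly deformed region
logits = np.zeros((8, 8))             # untrained mask, value 0.5 everywhere
loss = masked_photometric_loss(target, warped, logits)
```

In training, the mask map would be predicted per frame pair and optimized jointly with depth and pose, so gradients drive it toward zero only over regions whose reprojection error stays irreducibly high.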
Pages: 15