Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy

Cited by: 0
Authors
Solano, Pedro Esteban Chavarrias [1]
Bulpitt, Andrew [1]
Subramanian, Venkataraman [2,3]
Ali, Sharib [1]
Affiliations
[1] Univ Leeds, Fac Engn & Phys Sci, Sch Comp Sci, Leeds LS2 9JT, England
[2] Leeds Teaching Hosp NHS Trust, Dept Gastroenterol, Leeds, England
[3] St Jamess Univ Leeds, Leeds Inst Med Res, Div Gastroenterol & Surg Sci, Leeds, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Deep learning; Monocular depth estimation; Surface normal prediction; Multi-task learning; Cross-task consistency; 3D colonoscopy; QUANTIFICATION; SURFACE; MOTION;
DOI
10.1016/j.media.2024.103379
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and reconstructing it in 3D can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While depth estimation methods in computer vision have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing accurate estimation of camera depth. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage surface normal prediction to improve geometric feature extraction. We also apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate a 15.75% improvement in relative error and a 10.7% improvement in δ < 1.25 accuracy over the most accurate state-of-the-art baseline, the Big-to-Small (BTS) approach. All experiments are conducted on the recently released C3VD dataset, and thus we provide a first benchmark of state-of-the-art methods on this dataset.
Pages: 16
Related papers
50 items in total
  • [21] A Multi-task Learning Framework for Quality Estimation
    Deoghare, Sourabh
    Choudhary, Paramveer
    Kanojia, Diptesh
    Ranasinghe, Tharindu
    Bhattacharyya, Pushpak
    Orasan, Constantin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 9191 - 9205
  • [22] Multi-Task Learning for Influence Estimation and Maximization
    Panagopoulos, George
    Malliaros, Fragkiskos D.
    Vazirgiannis, Michalis
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (09) : 4398 - 4409
  • [23] CROSS-TASK CONSISTENCY IN STRATEGY USE AND THE RELATIONSHIP WITH INTELLIGENCE
    ALDERTON, DL
    LARSON, GE
    INTELLIGENCE, 1994, 18 (01) : 47 - 76
  • [24] Multi-Task Classification of Sewer Pipe Defects and Properties using a Cross-Task Graph Neural Network Decoder
    Haurum, Joakim Bruslund
    Madadi, Meysam
    Escalera, Sergio
    Moeslund, Thomas B.
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1441 - 1452
  • [25] Cross-Domain Multi-task Learning for Object Detection and Saliency Estimation
    Khattar, Apoorv
    Hegde, Srinidhi
    Hebbalaguppe, Ramya
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3634 - 3643
  • [26] Multi-task Forest for Human Pose Estimation in Depth Images
    Lallemand, Joe
    Pauly, Olivier
    Schwarz, Loren
    Tan, David
    Ilic, Slobodan
    2013 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2013), 2013, : 271 - 278
  • [27] Joint Learning of Image Deblurring and Depth Estimation Through Adversarial Multi-Task Network
    Hou, Shengyu
    Fu, Mengyin
    Song, Wenjie
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (12) : 7327 - 7341
  • [28] Learning Sparse Task Relations in Multi-Task Learning
    Zhang, Yu
    Yang, Qiang
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2914 - 2920
  • [29] Task Variance Regularized Multi-Task Learning
    Mao, Yuren
    Wang, Zekai
    Liu, Weiwei
    Lin, Xuemin
    Hu, Wenbin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8615 - 8629
  • [30] Task Switching Network for Multi-task Learning
    Sun, Guolei
    Probst, Thomas
    Paudel, Danda Pani
    Popovic, Nikola
    Kanakis, Menelaos
    Patel, Jagruti
    Dai, Dengxin
    Van Gool, Luc
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8271 - 8280