Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Cited by: 71
Authors
Yang, Lihe [1]
Kang, Bingyi [2]
Huang, Zilong [2]
Xu, Xiaogang [3, 4]
Feng, Jiashi [2]
Zhao, Hengshuang [1]
Affiliations
[1] HKU, Hong Kong, Peoples R China
[2] Tiktok, Beijing 9, Peoples R China
[3] CUHK, Hong Kong, Peoples R China
[4] ZJU, Hangzhou, Peoples R China
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/CVPR52733.2024.00987
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful foundation model that handles any image under any circumstances. To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus reduces the generalization error. We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to force the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos. It demonstrates impressive generalization ability (Figure 1). Further, through fine-tuning it with metric depth information from NYUv2 and KITTI, new SOTAs are set. Our better depth model also results in a better depth-conditioned ControlNet. Our models are released here.
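The two strategies named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `teacher_predict`, `strong_augment`, `build_pseudo_labeled_batch`, and `feature_alignment_loss` are hypothetical names, the "teacher" is a trivial stand-in rather than a depth network, the brightness jitter stands in for whatever augmentations the paper uses, and the cosine-similarity form of the auxiliary loss is an assumption made for illustration.

```python
import math
import random


def teacher_predict(image):
    """Hypothetical stand-in for a teacher depth model (here: copies brightness)."""
    return [[px for px in row] for row in image]


def strong_augment(image, rng, jitter=0.3):
    """Toy brightness jitter, clamped to [0, 1]; a placeholder for strong augmentation."""
    return [
        [min(1.0, max(0.0, px + rng.uniform(-jitter, jitter))) for px in row]
        for row in image
    ]


def build_pseudo_labeled_batch(unlabeled_images, rng):
    """Pair each perturbed image with the teacher's prediction on the clean image.

    The student sees the harder (augmented) input but is supervised with the
    label produced from the clean view, which makes the target more challenging.
    """
    batch = []
    for image in unlabeled_images:
        pseudo_label = teacher_predict(image)      # annotate the clean image
        student_input = strong_augment(image, rng)  # perturb only the student's view
        batch.append((student_input, pseudo_label))
    return batch


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def feature_alignment_loss(student_features, frozen_features):
    """Assumed auxiliary loss: 1 minus the mean cosine similarity per feature vector."""
    sims = [cosine_similarity(s, f) for s, f in zip(student_features, frozen_features)]
    return 1.0 - sum(sims) / len(sims)


if __name__ == "__main__":
    rng = random.Random(0)
    images = [[[0.2, 0.8], [0.5, 1.0]]]
    student_input, pseudo_label = build_pseudo_labeled_batch(images, rng)[0]
    print("student input:", student_input)
    print("pseudo label:", pseudo_label)
    print("alignment loss:", feature_alignment_loss([[1.0, 0.0]], [[0.0, 1.0]]))
```

In a real pipeline the teacher and student would be neural networks and the frozen features would come from a pre-trained encoder; the sketch only shows how the pseudo-label pairing and an alignment term fit together.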
Pages: 10371-10381
Page count: 11