Describing the Scene as a Whole: Joint Object Detection, Scene Classification and Semantic Segmentation

Cited by: 0
Authors
Yao, Jian [1]
Fidler, Sanja
Urtasun, Raquel [1]
Affiliations
[1] TTI Chicago, Chicago, IL USA
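Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes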
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we propose an approach to holistic scene understanding that reasons jointly about regions, location, class and spatial extent of objects, presence of a class in the image, as well as the scene type. Learning and inference in our model are efficient as we reason at the segment level, and introduce auxiliary variables that allow us to decompose the inherent high-order potentials into pairwise potentials between a few variables with a small number of states (at most the number of classes). Inference is done via a convergent message-passing algorithm, which, unlike graph-cuts inference, has no submodularity restrictions and does not require potential-specific moves. We believe this is very important, as it allows us to encode our ideas and prior knowledge about the problem without the need to change the inference engine every time we introduce a new potential. Our approach outperforms the state of the art on the MSRC-21 benchmark, while being much faster. Importantly, our holistic model is able to improve performance in all tasks.
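The decomposition idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's model: the potential below (a per-class cost paid once if any segment uses that class), the binary presence variables, and all function names are assumptions chosen for illustration. It checks by brute force that a high-order term over all segment labels equals the minimum, over the auxiliary presence variables, of an energy made only of unary and pairwise terms. Class costs are assumed non-negative.

```python
from itertools import product

INF = float("inf")

def high_order_cost(labels, class_costs):
    """High-order term: pay class_costs[k] once if any segment has label k."""
    return sum(class_costs[k] for k in set(labels))

def pairwise_cost(labels, presence, class_costs):
    """Same energy using only unary and pairwise terms (hypothetical form).

    presence[k] is an auxiliary binary variable: 1 if class k is 'on'.
    """
    # Unary term on each auxiliary variable.
    unary = sum(c * b for c, b in zip(class_costs, presence))
    # Pairwise consistency term: a segment may not take a class that is 'off'.
    pairwise = sum(
        INF if (x == k and b == 0) else 0.0
        for x in labels
        for k, b in enumerate(presence)
    )
    return unary + pairwise

def min_over_presence(labels, class_costs):
    """Minimize the pairwise energy over all settings of the auxiliary variables."""
    return min(
        pairwise_cost(labels, presence, class_costs)
        for presence in product((0, 1), repeat=len(class_costs))
    )

def decomposition_is_exact(num_segments, class_costs):
    """Brute-force check over every labeling of the segments."""
    num_classes = len(class_costs)
    return all(
        high_order_cost(labels, class_costs) == min_over_presence(labels, class_costs)
        for labels in product(range(num_classes), repeat=num_segments)
    )
```

In this toy setting each pairwise term involves one segment label and one binary variable, so no term has more states than the number of classes, which is the property the abstract highlights.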
Pages: 702 - 709
Page count: 8
Related papers
50 in total
  • [11] Joint scene classification and segmentation based on hidden Markov model
    Huang, JC
    Liu, Z
    Wang, Y
    IEEE TRANSACTIONS ON MULTIMEDIA, 2005, 7 (03) : 538 - 550
  • [12] An Algorithm for Scene Text Detection Using Multibox and Semantic Segmentation
    Qin, Hongbo
    Zhang, Haodi
    Wang, Hai
    Yan, Yujin
    Zhang, Min
    Zhao, Wei
    APPLIED SCIENCES-BASEL, 2019, 9 (06):
  • [13] Traffic scene perception algorithm with joint semantic segmentation and depth estimation
    Fan K.
    Zhong M.
    Tan J.
    Zhan Z.
    Feng Y.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2024, 58 (04) : 684 - 695
  • [14] Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation
    Saurabh Gupta
    Pablo Arbeláez
    Ross Girshick
    Jitendra Malik
    International Journal of Computer Vision, 2015, 112 : 133 - 149
  • [15] Fast Semantic Segmentation for Scene Perception
    Zhang, Xuetao
    Chen, Zhenxue
    Wu, Q. M. Jonathan
    Cai, Lei
    Lu, Dan
    Li, Xianming
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2019, 15 (02) : 1183 - 1192
  • [16] Movie scene segmentation using object detection and set theory
    Ul Haq, Ijaz
    Muhammad, Khan
    Hussain, Tanveer
    Kwon, Soonil
    Sodanil, Maleerat
    Baik, Sung Wook
    Lee, Mi Young
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2019, 15 (06)
  • [17] Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation
    Gupta, Saurabh
    Arbelaez, Pablo
    Girshick, Ross
    Malik, Jitendra
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 112 (02) : 133 - 149
  • [18] Indoor Scene Segmentation with Semantic Cuboids
    Fang, Zhuoqun
    Wu, Chengdong
    Jia, Tong
    2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO), 2015, : 2545 - 2550
  • [19] Semantic video scene segmentation and transfer
    Gritti, Tommaso
    Damkat, Chris
    Monaci, Gianluca
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2014, 122 : 172 - 181
  • [20] Whole Spine Segmentation Using Object Detection and Semantic Segmentation
    Da Mutten, Raffaele
    Zanier, Olivier
    Theiler, Sven
    Ryu, Seung-Jun
    Regli, Luca
    Serra, Carlo
    Staartjes, Victor E.
    NEUROSPINE, 2024, 21 (01) : 57 - 67