A Coarse-to-Fine Estimation Method for Spatial Layout of Indoor Scenes

Cited by: 0

Authors:

Liu T. [1]

Gu Y. [1]

Cao D. [1]

Dai X. [1]

Luo J. [2]

Affiliations:

[1] Jiangsu Provincial Key Lab of Image Processing and Image Communication, Nanjing University of Posts and Telecommunications, Nanjing

[2] Department of Computer Science, University of Rochester, Rochester

Source:

Jiqiren/Robot | 2019, Vol. 41, No. 1

Keywords:

Convolutional neural network; Indoor scene; Layout estimation; Scene layout category

DOI:

10.13973/j.cnki.robot.180017

Abstract:

A coarse-to-fine method for estimating spatial layout is presented to effectively label the layout of indoor scenes. First, an adaptive threshold detection method based on local discontinuities is used to extract the long straight lines of the given scene, which are split into vertical and horizontal lines according to their directions. The vertical and horizontal vanishing points are estimated using a voting mechanism and the orthogonality principle, and pairs of rays cast from two vanishing points at equal angular intervals are used to generate the layout candidates for the scene. Next, the informative edges and geometric context of the scene are estimated with a VGG-16 fully convolutional neural network, and a softmax classifier is applied to the fc7 features to obtain the layout category; global features that merge the informative edges and the layout category are then used to coarsely select the layout candidates. Then, the normal vectors and depth map of the scene are estimated with a VGG-based spatial multi-scale convolutional neural network to extract the corresponding normal vector and geometric depth features. After that, the 3D box spatial layout model is parameterized by the angles between the rays from the vanishing points; the line membership, geometric context, normal vector, and depth features are accumulated via geometric integral images to extract the regional features of the layout candidates, and the structural model parameters are learned with the cutting-plane method. Finally, the layout candidate with the highest structured prediction score is selected as the final spatial layout. Experimental results on the Hedau and LSUN datasets demonstrate that the presented method obtains a more accurate number of divided polygons and more precise boundary positions of the spatial layout. © 2019, Science Press. All rights reserved.
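The candidate-generation and final-selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each candidate is formed from one pair of rays per vanishing point, that rays are represented only by their angles, and that the structured score is a linear function of a feature vector. The names `ray_angles`, `generate_candidates`, and `select_layout` are hypothetical, and the weights are supplied by hand here, whereas the paper learns them with the cutting-plane method.

```python
import itertools
import math


def ray_angles(n_rays, start=0.0, stop=math.pi):
    """Return n_rays angles at equal angular intervals in [start, stop)."""
    step = (stop - start) / n_rays
    return [start + i * step for i in range(n_rays)]


def generate_candidates(n_vertical, n_horizontal):
    """Enumerate layout candidates.

    Assumption: a candidate is (pair of rays from the vertical vanishing
    point, pair of rays from the horizontal vanishing point), with each
    ray identified by its angle only.
    """
    v_pairs = itertools.combinations(ray_angles(n_vertical), 2)
    h_pairs = list(itertools.combinations(ray_angles(n_horizontal), 2))
    return [(v, h) for v in v_pairs for h in h_pairs]


def select_layout(candidates, features, weights):
    """Return the candidate with the highest linear score w . phi(candidate).

    `features` is a caller-supplied function mapping a candidate to a
    sequence of numbers; it stands in for the regional features
    (line membership, geometric context, normal vector, depth).
    """
    def score(candidate):
        return sum(w * f for w, f in zip(weights, features(candidate)))

    return max(candidates, key=score)
```

With 4 rays from one vanishing point and 3 from the other, this enumeration yields 6 x 3 = 18 candidates. In the described method, a coarse selection based on global features would prune this set before the structured score is evaluated.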

Pages: 58-64

Page count: 6
