Fully automatic person segmentation in unconstrained video using spatio-temporal conditional random fields

Cited by: 6

Authors:

Bhole, Chetan [1]

Pal, Christopher [2]

Affiliations:

[1] Univ Rochester, Rochester, NY 14620 USA

[2] Univ Montreal, Montreal, PQ, Canada

Source:

IMAGE AND VISION COMPUTING | 2016 / Vol. 51

Keywords:

Person segmentation; Video segmentation; Conditional random field; Optical flow; Fully automatic; POSE ESTIMATION;

DOI:

10.1016/j.imavis.2016.04.007

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The segmentation of objects, and of people in particular, is an important problem in computer vision. In this paper, we focus on automatically segmenting a person from challenging video sequences in which we place no constraint on camera viewpoint, camera motion, or the movements of a person in the scene. Our approach uses the most confident predictions from a pose detector as anchor, or keyframe, stick-figure predictions that help guide the segmentation of other, more challenging frames in the video. Even state-of-the-art pose detectors are unreliable on many frames, especially given that we impose no camera or motion constraints, so only the stick-figure predictions for frames with the highest confidence in a localized temporal region anchor further processing. The stick-figure predictions within confident keyframes are used to extract color, position, and optical flow features. Multiple conditional random fields (CRFs) are used to process blocks of video in batches, using a two-dimensional CRF for detailed keyframe segmentation as well as 3D CRFs for propagating segmentations to the entire sequence of frames belonging to each batch. Location information derived from the pose is also used to refine the results. Importantly, no hand-labeled training data is required by our method. We discuss the use of a continuity method that reuses learnt parameters between batches of frames and show how pose predictions can also be improved by our model. We provide an extensive evaluation of our approach, comparing it with a variety of alternative GrabCut-based methods and a prior state-of-the-art method. We also release our evaluation data to the community to facilitate further experiments. We find that our approach yields state-of-the-art qualitative and quantitative performance compared to prior work and more heuristic alternative approaches. (C) 2016 Elsevier B.V. All rights reserved.
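
The abstract states that only the pose predictions with the highest confidence in a localized temporal region anchor further processing. The sketch below illustrates one simple reading of that idea and is not the authors' implementation: it assumes each frame has a single scalar confidence score and that a "localized temporal region" is a fixed-size, non-overlapping window of frames. The function name `select_keyframes` and the `window` parameter are hypothetical.

```python
from typing import List, Sequence


def select_keyframes(confidences: Sequence[float], window: int) -> List[int]:
    """Return the index of the most confident frame in each consecutive window.

    Assumption: one scalar pose-detector confidence per frame, and
    non-overlapping windows of `window` frames (the last one may be shorter).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    keyframes = []
    for start in range(0, len(confidences), window):
        block = confidences[start:start + window]
        # Position of the maximum inside this window; ties go to the earliest frame.
        best = max(range(len(block)), key=lambda i: block[i])
        keyframes.append(start + best)
    return keyframes


if __name__ == "__main__":
    scores = [0.2, 0.9, 0.4, 0.1, 0.3, 0.8, 0.7]
    print(select_keyframes(scores, window=3))  # prints [1, 5, 6]
```

Under these assumptions, each returned index would mark a frame whose stick-figure prediction could serve as an anchor for the other frames in its window.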

Pages: 58-68

Number of pages: 11
