FastClip: An Efficient Video Understanding System with Heterogeneous Computing and Coarse-to-fine Processing

Cited by: 0

Authors:

Zhao, Liming [1]

Sun, Siyang [1]

Zhang, Yanhao [1]

Zheng, Yun [1]

Pan, Pan [1]

Affiliation:

[1] Alibaba Group, Hangzhou, People's Republic of China

Source:

COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2022, WWW 2022 COMPANION | 2022

Keywords:

video understanding; heterogeneous computing; system speedup

DOI:

10.1145/3487553.3524209

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Video media is growing exponentially in areas such as e-commerce shopping and gaming, and understanding video content is critical for real-world applications. However, processing long videos is usually time-consuming and expensive. In this paper, we present an efficient video understanding system that speeds up video processing with a coarse-to-fine two-stage pipeline and a heterogeneous computing framework. First, a coarse but fast multi-modal filtering module recognizes and removes useless segments from a long video; this module can be deployed on an edge device and reduces computation for the subsequent stage. Second, several semantic models finely parse the remaining sequences. To accelerate model inference, we propose a novel heterogeneous computing framework that trains a model with both lightweight and heavyweight backbones, supporting distributed deployment across a powerful device (e.g., cloud or GPU) and a different device (e.g., edge or CPU). In this way, the model can be both efficient and effective. The proposed system has been widely used at Alibaba, including in "Taobao Live Analysis" and "Commodity Short-Video Generation", where it can achieve a 10x speedup for the system.
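
The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how segments are scored or what the semantic models return, so the `Segment` type, the `coarse_score` field, the `threshold` value, and the `model` callable are all hypothetical names chosen for illustration. The sketch only shows the control flow: a cheap filter discards segments first, and the expensive model runs on the survivors alone.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """A video segment with a score from a cheap filter (hypothetical type)."""
    start: float         # segment start time in seconds
    end: float           # segment end time in seconds
    coarse_score: float  # assumed relevance score in [0, 1] from the coarse stage


def coarse_filter(segments: List[Segment], threshold: float = 0.5) -> List[Segment]:
    """Stage 1: keep only segments whose cheap score reaches the threshold."""
    return [s for s in segments if s.coarse_score >= threshold]


def fine_parse(
    segments: List[Segment], model: Callable[[Segment], str]
) -> List[Tuple[Segment, str]]:
    """Stage 2: run the expensive model on each remaining segment."""
    return [(s, model(s)) for s in segments]


def run_pipeline(
    segments: List[Segment],
    model: Callable[[Segment], str],
    threshold: float = 0.5,
) -> List[Tuple[Segment, str]]:
    """Coarse-to-fine: filter cheaply first, then parse what is left."""
    kept = coarse_filter(segments, threshold)
    return fine_parse(kept, model)
```

In this sketch the saving comes purely from calling `model` fewer times; how much is saved depends on how many segments the coarse stage discards, which the abstract does not quantify per stage.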


Pages: 67-71

Page count: 5
