Multi-stream CNN based Video Semantic Segmentation for Automated Driving

Cited by: 3

Authors:

Sistu, Ganesh [1]

Chennupati, Sumanth [2]

Yogamani, Senthil [1]

Affiliations:

[1] Valeo Vis Syst, Dublin, Ireland

[2] Valeo Troy, Troy, NY USA

Source:

PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5 | 2019

Keywords:

Semantic Segmentation; Visual Perception; Automated Driving;

DOI:

10.5220/0007248401730180

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

The majority of semantic segmentation algorithms operate on a single frame, even when the input is video. In this work, the goal is to exploit temporal information within the model, leveraging motion cues and temporal consistency. We propose two simple high-level architectures based on Recurrent FCN (RFCN) and Multi-Stream FCN (MSFCN) networks. In RFCN, a recurrent network, namely an LSTM, is inserted between the encoder and the decoder. MSFCN combines the encoders of different frames into a fused encoder via 1x1 channel-wise convolution. We use a ResNet50 network as the baseline encoder and construct three networks: MSFCN of order 2 and 3, and RFCN of order 2. MSFCN-3 produces the best results, with accuracy improvements of 9% and 15% for the Highway and New York-like city scenarios in the SYNTHIA-CVPR'16 dataset, measured by mean IoU. MSFCN-3 also produced improvements of 11% and 6% on the SegTrack V2 and DAVIS datasets over the baseline FCN network. We also designed efficient versions of MSFCN-2 and RFCN-2 that share weights between the two encoders. The efficient MSFCN-2 provided improvements of 11% and 5% on KITTI and SYNTHIA with a negligible increase in computational complexity compared to the baseline version.
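The multi-stream fusion step described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes that the per-frame encoder feature maps are concatenated along the channel axis before the 1x1 convolution is applied; the abstract does not spell out this detail, and the function name `fuse_1x1`, the nested-list layout, and the toy weights are hypothetical choices made only for illustration, not the authors' implementation.

```python
def fuse_1x1(frame_features, weights, bias):
    """Fuse per-frame feature maps with a 1x1 convolution (illustrative sketch).

    frame_features: list of feature maps, one per frame, each shaped [C][H][W]
    weights: [C_out][C_total], where C_total is the channel count summed over frames
    bias: [C_out]
    """
    # Assumption: frames are stacked along the channel axis before fusion.
    stacked = [channel for fmap in frame_features for channel in fmap]
    height, width = len(stacked[0]), len(stacked[0][0])

    fused = []
    for w_row, b in zip(weights, bias):
        # A 1x1 convolution is a per-pixel weighted sum over input channels.
        out_channel = [
            [b + sum(w * ch[y][x] for w, ch in zip(w_row, stacked))
             for x in range(width)]
            for y in range(height)
        ]
        fused.append(out_channel)
    return fused


# Toy example: two frames (order 2), one channel each, spatial size 1x2.
previous_frame = [[[1.0, 2.0]]]
current_frame = [[[3.0, 4.0]]]
fused = fuse_1x1([previous_frame, current_frame], weights=[[0.5, 0.5]], bias=[0.0])
print(fused)
```

With equal weights the fused map is the per-pixel average of the two frames; in a trained network the weights would be learned, so the fusion can emphasise whichever frame and channel carries the most useful cue.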


Pages: 173-180

Number of pages: 8
