JSSE: Joint Sequential Semantic Encoder for Zero-Shot Event Recognition

Cited by: 1

Authors:

Madapana N. [1]

Wachs J.P. [1]

Affiliations:

[1] Purdue University, School of Industrial Engineering, West Lafayette, IN 47906

Source:

IEEE Transactions on Artificial Intelligence | 2023, Vol. 4, No. 6

Funding:

U.S. National Science Foundation; U.S. Agency for Healthcare Research and Quality; U.S. National Institutes of Health

Keywords:

Action and gesture recognition; activity; semantic descriptors; transfer learning; zero-shot learning (ZSL)

DOI:

10.1109/TAI.2022.3208860

Abstract:

Zero-shot learning (ZSL) is a transfer learning paradigm that aims to recognize unknown categories from only a description of them. ZSL has been thoroughly studied for static object recognition; however, ZSL for dynamic events (zero-shot event recognition, ZSER) such as activities and gestures has hardly been investigated. This article addresses ZSER by relying on semantic attributes of events to transfer knowledge learned from seen classes to unseen ones. First, we used the Amazon Mechanical Turk platform to create the first attribute-based gesture dataset, referred to as zero-shot gestural learning (ZSGL), comprising the categories present in the MSRC and Italian gesture datasets. Overall, the ZSGL dataset consists of 26 categories and 65 discriminative attributes, with 16 attribute annotations and 400 examples per category. We used trainable recurrent networks and 3-D convolutional neural networks (CNNs) to learn spatiotemporal features. Next, we propose a simple yet effective end-to-end approach for ZSER, referred to as joint sequential semantic encoder (JSSE), to explore temporal patterns, to efficiently represent events in the latent space, and to simultaneously optimize for both the semantic and classification tasks. We evaluated our model on ZSGL and two action datasets (UCF and HMDB) and compared the performance of JSSE against several existing baselines under four experimental conditions: 1) within-category, 2) across-category, 3) closed-set, and 4) open-set. Results show that JSSE considerably outperforms (p < 0.05) other approaches and performs favorably on both datasets under all experimental conditions. © 2020 IEEE.

Pages: 1472-1483

Page count: 11

