Audiovisual Event Detection Towards Scene Understanding

Cited by: 0
Authors
Canton-Ferrer, C. [1 ]
Butko, T. [1 ]
Segura, C. [1 ]
Giro, X. [1 ]
Nadeu, C. [1 ]
Hernando, J. [1 ]
Casas, J. R. [1 ]
Affiliation
[1] Tech Univ Catalonia, Barcelona, Spain
Source
2009 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPR WORKSHOPS 2009), VOLS 1 AND 2 | 2009
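Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract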
Acoustic events produced in meeting environments may contain useful information for perceptually aware interfaces and multimodal behavior analysis. This paper presents a system that detects and recognizes these events from a multimodal perspective, combining information from multiple cameras and microphones. First, spectral and temporal features are extracted from a single audio channel, and spatial localization is achieved by exploiting cross-correlation among microphone arrays. Second, several video cues obtained from multi-person tracking, motion analysis, face recognition, and object detection provide the visual counterpart of the acoustic events to be detected. Multimodal data fusion is carried out at score level using two approaches: weighted mean average and fuzzy integral. Finally, a multimodal database containing a rich variety of acoustic events has been recorded, including manual annotations of the data. A set of metrics allows the performance of the presented algorithms to be assessed. This dataset is made publicly available for research purposes.
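The score-level fusion named in the abstract can be illustrated with a minimal sketch. The abstract gives no weights, no fuzzy measure, and does not say which fuzzy integral is used, so everything below is an assumption for illustration: the Choquet integral stands in as one common fuzzy integral, and the function names, modality labels, and numeric values are hypothetical.

```python
def weighted_mean_fusion(scores, weights):
    """Fuse per-modality scores with a weighted arithmetic mean.

    scores and weights are dicts keyed by modality name (hypothetical labels).
    """
    total = sum(weights[m] for m in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[m] * weights[m] for m in scores) / total


def choquet_fusion(scores, measure):
    """Fuse per-modality scores with a Choquet integral (assumed variant).

    measure maps each frozenset of modalities to a value in [0, 1]; it is
    expected to be monotone, with the full set mapped to 1.
    """
    ordered = sorted(scores, key=scores.get)  # ascending by score
    fused = 0.0
    previous = 0.0
    for i, modality in enumerate(ordered):
        # Coalition of all modalities scoring at least as high as this one.
        coalition = frozenset(ordered[i:])
        fused += (scores[modality] - previous) * measure[coalition]
        previous = scores[modality]
    return fused


# Illustrative values only; none of these come from the paper.
scores = {"audio": 0.8, "video": 0.4}
weights = {"audio": 0.7, "video": 0.3}
measure = {
    frozenset({"audio"}): 0.6,
    frozenset({"video"}): 0.3,
    frozenset({"audio", "video"}): 1.0,
}

mean_score = weighted_mean_fusion(scores, weights)
fuzzy_score = choquet_fusion(scores, measure)
```

When the measure is additive (each coalition's value is the sum of its members' weights), the Choquet integral reduces to the weighted mean; a non-additive measure lets the fusion model interaction between modalities, which a plain weighted mean cannot express.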
Pages: 840 - 847
Page count: 8
Related papers
50 in total
  • [41] A scene-based general baseball event detection system
    Lien, Cheng-Chang
    Chiang, Chiu-Lung
    IMECS 2006: INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, 2006, : 507 - +
  • [42] TUT Database for Acoustic Scene Classification and Sound Event Detection
    Mesaros, Annamaria
    Heittola, Toni
    Virtanen, Tuomas
    2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 1128 - 1132
  • [43] Towards Understanding the Role of the Human in Event Log Extraction
    Dani, Vinicius Stein
    Leopold, Henrik
    van der Werf, Jan Martijn E. M.
    Lu, Xixi
    Beerepoot, Iris
    Koorn, Jelmer J.
    Reijers, Hajo A.
    BUSINESS PROCESS MANAGEMENT WORKSHOPS, BPM 2021, 2022, 436 : 86 - 98
  • [44] Audiovisual gunshot event recognition
    Chen, Cheng-Yao
    Abdallah, Ahmed
    Wolf, Wayne
    2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS, 2006, : 4807 - +
  • [45] DRUformer: Enhancing Driving Scene Important Object Detection With Driving Scene Relationship Understanding
    Niu, Yingjie
    Ding, Ming
    Fujii, Keisuke
    Ohtani, Kento
    Carballo, Alexander
    Takeda, Kazuya
    IEEE ACCESS, 2024, 12 : 67589 - 67599
  • [46] ACTION, MISE-EN-SCENE, EVENT OR AUDIOVISUAL CONSTRUCTION? A BRIEF INTRODUCTION TO THE CONCEPT OF PERFORMANCE IN HUMANITIES AND MUSIC
    San Cristobal Opazo, Ursula Pilar
    CUADERNOS DE MUSICA ARTES VISUALES Y ARTES ESCENICAS, 2018, 13 (01): : 207 - 231
  • [47] Detection of Event of Interest for Satellite Video Understanding
    Gu, Yanfeng
    Wang, Tengfei
    Jin, Xudong
    Gao, Guoming
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (11): : 7860 - 7871
  • [48] Towards 3D Scene Understanding Using Differentiable Rendering
    Periyasamy A.S.
    Behnke S.
    SN Computer Science, 4 (3)
  • [49] Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework
    Li, Li-Jia
    Socher, Richard
    Li Fei-Fei
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 2036 - 2043
  • [50] Towards Scene Understanding with Detailed 3D Object Representations
    Zia, M. Zeeshan
    Stark, Michael
    Schindler, Konrad
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 112 (02) : 188 - 203