Task Estimation Using Latent Semantic Analysis of Visual Scenes and Spoken Words

Authors
Kimura, Masashi [1]
Sawada, Shinta [1]
Iribe, Yurie [2]
Katsurada, Kouichi [1]
Nitta, Tsuneo [1]
Affiliations
[1] Toyohashi Univ Technol, Toyohashi, Aichi, Japan
[2] Toyohashi Univ Technol, Informat & Media Ctr, Toyohashi, Aichi, Japan
Keywords
multimodal processing; latent semantic analysis; task estimation
DOI
10.1002/ecj.11560
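Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract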
In this paper, we propose a task estimation method based on multiple subspaces extracted from multimodal information: image objects in visual scenes and spoken words in dialogue that appear in the same task. The subspaces are obtained by latent semantic analysis (LSA). In the proposed method, a task vector composed of the frequencies of spoken words and of image-object appearances is extracted first, and then the similarities between the input task vector and the reference subspaces of the different tasks are compared. Experiments are conducted on the identification of game tasks. The results show that the proposed method with multimodal information outperforms methods that use only a single modality, either images or spoken dialogue. The proposed method also remains accurate when less spoken dialogue is available.
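The pipeline described in the abstract (frequency vector, per-task LSA subspace, similarity comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature layout, the task names `chess` and `cards`, the toy counts, the subspace dimension `k`, and the squared-projection similarity are all assumptions chosen for illustration, and the record does not say how the paper weights or normalizes its vectors.

```python
import numpy as np

def task_subspace(X, k):
    """Return an orthonormal basis (features x k) for one task.

    X is a features x samples count matrix; the basis is taken from the
    top-k left singular vectors, as in a truncated-SVD form of LSA.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

def subspace_similarity(v, U):
    """Fraction of the squared norm of v that lies inside the subspace U."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return 0.0
    return float(np.linalg.norm(U.T @ v) ** 2 / norm ** 2)

def estimate_task(v, subspaces):
    """Pick the task whose reference subspace is most similar to v."""
    return max(subspaces, key=lambda name: subspace_similarity(v, subspaces[name]))

# Hypothetical features: [word "move", word "deal", object "board", object "card"].
# Each column is one reference sample (toy counts, not data from the paper).
references = {
    "chess": np.array([[3, 2, 4], [1, 0, 1], [5, 4, 6], [0, 0, 0]], dtype=float),
    "cards": np.array([[0, 1, 0], [2, 3, 2], [0, 0, 1], [4, 5, 6]], dtype=float),
}
subspaces = {name: task_subspace(X, k=1) for name, X in references.items()}

# Input task vector: word counts and object counts concatenated.
query = np.array([2, 0, 3, 0], dtype=float)
predicted = estimate_task(query, subspaces)
print(predicted)
```

Concatenating word and object counts into one vector is the simplest way to combine the two modalities in this sketch; dropping the word rows or the object rows would give the single-modality baselines the abstract compares against.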
Pages: 33-42
Number of pages: 10