Task Estimation Using Latent Semantic Analysis of Visual Scenes and Spoken Words

Cited by: 0

Authors:

Kimura, Masashi [1]

Sawada, Shinta [1]

Iribe, Yurie [2]

Katsurada, Kouichi [1]

Nitta, Tsuneo [1]

Affiliations:

[1] Toyohashi Univ Technol, Toyohashi, Aichi, Japan

[2] Toyohashi Univ Technol, Informat & Media Ctr, Toyohashi, Aichi, Japan

Source:

ELECTRONICS AND COMMUNICATIONS IN JAPAN | 2014 / Vol. 97 / No. 06

Keywords:

multimodal processing; latent semantic analysis; task estimation

DOI:

10.1002/ecj.11560

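Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Subject classification code:

0808; 0809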

Abstract:

In this paper, we propose a task estimation method based on multiple subspaces extracted from multimodal information: image objects in visual scenes and spoken words in dialogue that appear in the same task. The subspaces are obtained by latent semantic analysis (LSA). In the proposed method, a task vector composed of spoken words and the frequencies of image-object appearances is extracted first, and then the similarities between the input task vector and the reference subspaces of the different tasks are compared. Experiments are conducted on the identification of game tasks. The results show that the proposed method with multimodal information outperforms methods that use only a single modality, either images or spoken dialogue. The proposed method remains accurate even when less spoken dialogue is used.
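
The abstract does not specify how the subspaces are built or which similarity measure is used, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each task's reference subspace is spanned by the top-k left singular vectors of that task's training matrix (a truncated SVD, as is common in LSA), and that similarity is the length of the normalized input vector's projection onto the subspace. The feature layout, the toy counts, and the names `fit_subspace`, `subspace_similarity`, and `estimate_task` are all hypothetical.

```python
import numpy as np

# Hypothetical feature layout (an assumption for illustration):
# counts of the spoken words "card", "deal", "dice", "roll",
# followed by appearance counts of the image objects "card" and "dice".

def fit_subspace(samples, k):
    """Return an orthonormal basis (n_features x k) for one task.

    samples: array-like of shape (n_samples, n_features); each row is one
    multimodal task vector (word counts followed by object counts).
    """
    X = np.asarray(samples, dtype=float).T          # features x samples
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]                                 # top-k left singular vectors

def subspace_similarity(basis, vec):
    """Length of the unit-normalized vector's projection onto the subspace."""
    v = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return 0.0
    return float(np.linalg.norm(basis.T @ (v / norm)))

def estimate_task(subspaces, vec):
    """Pick the task whose reference subspace is most similar to vec."""
    scores = {task: subspace_similarity(b, vec) for task, b in subspaces.items()}
    return max(scores, key=scores.get), scores

# Toy training data, invented for this sketch (not from the paper).
training = {
    "card_game": [[3, 2, 0, 0, 4, 0], [2, 3, 0, 0, 5, 0], [4, 1, 0, 1, 3, 0]],
    "dice_game": [[0, 0, 3, 2, 0, 4], [0, 0, 2, 4, 0, 5], [1, 0, 4, 1, 0, 3]],
}
subspaces = {task: fit_subspace(rows, k=2) for task, rows in training.items()}

# An input task vector combining spoken-word and image-object counts.
query = [2, 2, 0, 0, 4, 0]
predicted, scores = estimate_task(subspaces, query)
print(predicted, scores)
```

Normalizing the input vector keeps each score between 0 and 1, so inputs with different amounts of dialogue remain comparable; other similarity measures would fit the same structure.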


Pages: 33-42

Page count: 10

