Cognitive state classification in a spoken tutorial dialogue system

Cited: 7
Authors
Zhang, Tong [1]
Hasegawa-Johnson, Mark [1]
Levinson, Stephen E. [1]
Affiliations
[1] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
intelligent tutoring system; user affect recognition; spoken language processing
DOI
10.1016/j.specom.2005.09.006
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
This paper addresses the manual and automatic labelling, from spontaneous speech, of a particular type of user affect that we call the cognitive state in a tutorial dialogue system for students of primary and early middle school ages. Our definition of the cognitive state is based on analysis of children's spontaneous speech, acquired during Wizard-of-Oz simulations of an intelligent math and physics tutor. The cognitive states of children are categorized into three classes: confidence, puzzlement, and hesitation. The manual labelling of cognitive states had an inter-transcriber agreement of kappa score 0.93. The automatic cognitive state labels are generated by classifying prosodic features, text features, and spectral features. Text features are generated from an automatic speech recognition (ASR) system; features include indicator functions of keyword classes and part-of-speech sequences. Spectral features are created based on acoustic likelihood scores of a cognitive state-dependent ASR system, in which phoneme models are adapted to utterances labelled for a particular cognitive state. The effectiveness of the proposed method has been tested on both manually and automatically transcribed speech, and the test yielded very high correctness: 96.6% for manually transcribed speech and 95.7% for automatically recognized speech. Our study shows that the proposed spectral features greatly outperformed the other types of features in the cognitive state classification experiments. Our study also shows that the spectral and prosodic features derived directly from speech signals were very robust to speech recognition errors, much more so than the lexical and part-of-speech based features. (C) 2005 Elsevier B.V. All rights reserved.
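The abstract reports an inter-transcriber agreement of kappa 0.93 for the manual cognitive-state labels. As a minimal illustration of how such a figure is computed (not the authors' actual annotation data), the sketch below implements Cohen's kappa from scratch over two hypothetical annotators assigning the paper's three labels (confidence, puzzlement, hesitation); the toy label sequences are invented for demonstration only.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators
    who labelled the same set of utterances."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # observed agreement: fraction of utterances with identical labels
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # expected chance agreement from each annotator's label marginals
    ca, cb = Counter(labels_a), Counter(labels_b)
    pe = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (po - pe) / (1 - pe)

# toy example using the three cognitive-state labels from the paper
a = ["confidence", "puzzlement", "hesitation", "confidence", "confidence"]
b = ["confidence", "puzzlement", "confidence", "confidence", "confidence"]
print(round(cohens_kappa(a, b), 3))  # -> 0.583
```

A kappa of 0.93, as reported in the paper, indicates near-perfect agreement after correcting for the agreement two annotators would reach by chance given their label frequencies.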
Pages: 616 - 632 (17 pages)