The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions

Cited: 0
Authors
Lopes, José [1]
Hemmingsson, Nils [1]
Åstrand, Oliver [1]
Affiliation
[1] KTH Royal Institute of Technology, Stockholm, Sweden
Source
PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018
Keywords
Dialogues; Spontaneous; Multi-modal
DOI
Not available
CLC number
TP39 [Computer Applications]
Discipline codes
081203; 0835
Abstract
This paper describes the Spot the Difference Corpus, which contains 54 interactions between pairs of subjects who collaborate to find the differences between two very similar scenes. We describe the recording setup, the participants' metadata, and the collection procedure. We are releasing this corpus of task-oriented spontaneous dialogues; the release includes rich transcriptions, annotations, audio, and video. We believe that this dataset constitutes a valuable resource for studying several dimensions of human communication, ranging from turn-taking to referring expressions. In our preliminary analyses we examined task success (the proportion of differences found out of the total number of differences) and how it evolves over time. In addition, we examined scene complexity, measured as the entropy of the scenes' RGB components, and how it relates to speech overlaps, interruptions, and expressions of uncertainty. We found a tendency for more complex scenes to elicit more competitive interruptions.
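The scene-complexity measure mentioned in the abstract lends itself to a concrete illustration. The abstract does not specify exactly how the entropy of the RGB components was computed, so the sketch below rests on assumptions of ours: it takes the Shannon entropy of each 8-bit channel's intensity histogram and averages over the three channels. The function names and the image path are hypothetical.

```python
# Minimal sketch (our assumption): scene complexity as the mean Shannon
# entropy of the R, G and B intensity histograms of a scene image.
import numpy as np
from PIL import Image

def channel_entropy(channel: np.ndarray) -> float:
    """Shannon entropy (in bits) of one 8-bit image channel."""
    hist, _ = np.histogram(channel, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def scene_complexity(path: str) -> float:
    """Average per-channel entropy of an RGB image (hypothetical helper)."""
    rgb = np.asarray(Image.open(path).convert("RGB"))
    return float(np.mean([channel_entropy(rgb[..., c]) for c in range(3)]))

# Hypothetical file name; a flat single-colour scene scores near 0 bits,
# while a visually busy scene approaches the 8-bit maximum of 8 bits.
print(scene_complexity("scene_01.png"))
```

Under this reading, scenes with higher entropy values are the "more complex" ones that, per the abstract, tend to co-occur with more competitive interruptions.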
Pages: 1939-1945
Number of pages: 7
Related papers (50 in total; showing items [31]-[40])
  • [31] Grammars of spoken English: New outcomes of corpus-oriented research
    Leech, G
    LANGUAGE LEARNING, 2000, 50 (04): 675-724
  • [32] Task-Oriented Multi-Modal Mutual Learning for Vision-Language Models
    Long, Sifan
    Zhao, Zhen
    Yuan, Junkun
    Tan, Zichang
    Liu, Jiangjiang
    Zhou, Luping
    Wang, Shengsheng
    Wang, Jingdong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 21902-21912
  • [33] VEMO: A Versatile Elastic Multi-modal Model for Search-Oriented Multi-task Learning
    Fei, Nanyi
    Jiang, Hao
    Lu, Haoyu
    Long, Jinqiang
    Dai, Yanqi
    Fan, Tuo
    Cao, Zhao
    Lu, Zhiwu
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT I, 2024, 14608: 56-72
  • [34] Corpus Analysis of Spoken Smart-Home Interactions with Older Users
    Moeller, Sebastian
    Goedde, Florian
    Wolters, Maria
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008: 735-740
  • [35] MEDIA: a semantically annotated corpus of task oriented dialogs in French
    Bonneau-Maynard, Helene
    Quignard, Matthieu
    Denis, Alexandre
    LANGUAGE RESOURCES AND EVALUATION, 2009, 43 (04): 329-354
  • [36] MultiMAE: Multi-modal Multi-task Masked Autoencoders
    Bachmann, Roman
    Mizrahi, David
    Atanov, Andrei
    Zamir, Amir
    COMPUTER VISION, ECCV 2022, PT XXXVII, 2022, 13697: 348-367
  • [37] A multi-purpose audio-visual corpus for multi-modal Persian speech recognition: The Arman-AV dataset
    Peymanfard, Javad
    Heydarian, Samin
    Lashini, Ali
    Zeinali, Hossein
    Mohammadi, Mohammad Reza
    Mozayani, Nasser
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [38] Exploiting Multi-Modal Interactions: A Unified Framework
    Li, Ming
    Xue, Xiao-Bing
    Zhou, Zhi-Hua
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009: 1120-1125
  • [39] FARMI: A FrAmework for Recording Multi-Modal Interactions
    Jonell, Patrik
    Bystedt, Mattias
    Fallgren, Per
    Kontogiorgos, Dimosthenis
    Lopes, Jose
    Malisz, Zofia
    Mascarenhas, Samuel
    Oertel, Catharine
    Raveh, Eran
    Shore, Todd
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018: 3969-3974
  • [40] Multi-Modal Interactions of Mixed Reality Framework
    Omary, Danah
    Mehta, Gayatri
    17TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE, DCAS 2024, 2024