Audio-Visual Detection of Multiple Chirping Robots

Authors
Gribovskiy, Alexey [1]
Mondada, Francesco [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, LSRO, CH-1015 Lausanne, Switzerland
Keywords
Microphone arrays; sound localization; audio-visual; multi-source; information fusion
DOI
10.3233/978-1-58603-887-8-324
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The design, study, and control of mixed animal-robot societies is a field of scientific exploration that can bring new opportunities for studying and controlling groups of social insects and animals and, in particular, for improving the welfare and breeding conditions of domestic animals. Our long-term objective is to develop a mobile robot that is socially accepted by chickens and able to interact with them through appropriate communication channels. To interact, the robot has to know the positions of all birds in an experimental area and detect which of them are uttering calls. In this paper, we present an audio-visual approach to locating the robots and animals in a scene and detecting their calling activity. Visual tracking is provided by a marker-based tracker using an overhead camera. Sound localization is achieved by beamforming with an array of sixteen microphones. Visual and sound information are fused probabilistically to detect calling activity. The experimental results demonstrate that our system is capable of detecting the sound emission activity of multiple moving robots with 90% probability.
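The abstract names the building blocks (positions from a visual tracker, beamforming over a microphone array, a probabilistic combination) without giving details. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. It assumes plain delay-and-sum beamforming with integer-sample delays, a speed of sound of 343 m/s, and 2-D coordinates shared by the camera and the array. The function names, the `noise_power` parameter, and the score formula are all hypothetical placeholders chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steered_power(signals, mic_positions, source_xy, sample_rate):
    """Delay-and-sum power of the array when steered at source_xy.

    signals: one equal-length list of samples per microphone.
    mic_positions, source_xy: (x, y) coordinates in metres.
    """
    # Propagation delay, in whole samples, from the candidate point to each mic.
    delays = [
        round(math.dist(source_xy, mic) / SPEED_OF_SOUND * sample_rate)
        for mic in mic_positions
    ]
    # Only relative delays matter, so shift everything by the smallest one.
    base = min(delays)
    shifts = [d - base for d in delays]
    # Number of time steps for which every shifted channel still has a sample.
    n = min(len(sig) - k for sig, k in zip(signals, shifts))
    total = 0.0
    for t in range(n):
        # Advance each channel by its shift so a wavefront from source_xy lines up.
        aligned_mean = sum(sig[t + k] for sig, k in zip(signals, shifts)) / len(signals)
        total += aligned_mean ** 2
    return total / n


def calling_score(power, noise_power):
    """Hypothetical score in (0, 1); NOT the fusion rule used in the paper.

    noise_power must be positive; it stands in for a calibrated noise floor.
    """
    return power / (power + noise_power)


def score_tracked_agents(signals, mic_positions, tracked, sample_rate, noise_power):
    """Map each visually tracked id to a calling score at its position."""
    return {
        agent_id: calling_score(
            steered_power(signals, mic_positions, xy, sample_rate), noise_power
        )
        for agent_id, xy in tracked.items()
    }
```

In this sketch the visual tracker restricts the acoustic search to the known agent positions, so the beamformer is evaluated once per tracked agent rather than over a full grid. A real system would likely need fractional delays, a calibrated noise model, and a principled probabilistic model in place of `calling_score`.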
Pages: 324-331
Number of pages: 8