Hypothesis testing for evaluating a multimodal pattern recognition framework applied to speaker detection

Cited by: 8
Authors
Besson, Patricia [1]
Kunt, Murat [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, ITS, CH-1015 Lausanne, Switzerland
Keywords
Mutual Information; Audio Signal; Mouth Region; Audio Feature; Video Feature
DOI
10.1186/1743-0003-5-11
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Background: Speaker detection is an important component of many human-computer interaction applications, such as multimedia indexing and ambient intelligent systems. This work addresses the problem of detecting the current speaker in audio-visual sequences. The detector needs only simple equipment: a single camera and a single microphone suffice.
Method: A multimodal pattern recognition framework is proposed, with solutions provided for each step of the process: feature generation and extraction, classification, and evaluation of the system performance. The decision is based on an estimate of the synchrony between the audio and video signals. Prior to classification, an information-theoretic framework is applied to extract optimized audio features using video information. The classification step is then defined through a hypothesis testing framework in order to obtain confidence levels associated with the classifier outputs, thereby allowing the performance of the whole multimodal pattern recognition system to be evaluated.
Results: Through the hypothesis testing approach, the classifier performance can be given as a ratio of detection to false-alarm probabilities. Above all, the hypothesis tests provide a means of measuring the efficiency of the whole pattern recognition process. In particular, the gain offered by the proposed feature extraction step can be evaluated. It is shown that introducing such a feature extraction step increases the ability of the classifier to produce good relative instance scores and, therefore, the performance of the pattern recognition process.
Conclusion: The powerful capacities of hypothesis tests as an evaluation tool are exploited to assess the performance of a multimodal pattern recognition process. In particular, the benefit of performing, versus omitting, a feature extraction step prior to classification is evaluated. Although the proposed framework is used here for detecting the speaker in audiovisual sequences, it could be applied to any other classification task involving two spatio-temporally co-occurring signals.
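To make the evaluation measure in the Results concrete, the sketch below estimates detection and false-alarm probabilities for a simple threshold test and reports their ratio. It is only an illustration under stated assumptions, not the authors' method: the function name `detection_and_false_alarm`, the threshold value, and the synchrony scores are all hypothetical and do not come from the paper.

```python
from typing import Sequence, Tuple


def detection_and_false_alarm(
    scores_h1: Sequence[float],
    scores_h0: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Estimate (P_D, P_FA) for the test "decide H1 when score > threshold".

    scores_h1: scores observed when H1 is true (here: the person speaks).
    scores_h0: scores observed when H0 is true (here: the person is silent).
    """
    if not scores_h1 or not scores_h0:
        raise ValueError("both score lists must be non-empty")
    # Detection probability: fraction of true-H1 scores that exceed the threshold.
    p_d = sum(s > threshold for s in scores_h1) / len(scores_h1)
    # False-alarm probability: fraction of true-H0 scores that exceed the threshold.
    p_fa = sum(s > threshold for s in scores_h0) / len(scores_h0)
    return p_d, p_fa


# Hypothetical audio-video synchrony scores (made up for illustration).
speaker_scores = [0.82, 0.74, 0.91, 0.40, 0.66]
non_speaker_scores = [0.10, 0.35, 0.52, 0.22, 0.05]

p_d, p_fa = detection_and_false_alarm(speaker_scores, non_speaker_scores, 0.5)
# Ratio of detection to false-alarm probability; infinite if no false alarms occur.
ratio = p_d / p_fa if p_fa > 0 else float("inf")
print(f"P_D={p_d:.2f}  P_FA={p_fa:.2f}  ratio={ratio:.2f}")
```

Computing the same ratio on scores obtained with and without a feature extraction step is one way such a measure could be used to compare the two pipelines.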
Pages: 8