Hypothesis testing for evaluating a multimodal pattern recognition framework applied to speaker detection

Cited by: 8
Authors
Besson, Patricia [1]
Kunt, Murat [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, ITS, CH-1015 Lausanne, Switzerland
Keywords
Mutual Information; Audio Signal; Mouth Region; Audio Feature; Video Feature
DOI
10.1186/1743-0003-5-11
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Background: Speaker detection is an important component of many human-computer interaction applications, such as multimedia indexing and ambient intelligent systems. This work addresses the problem of detecting the current speaker in audio-visual sequences. The detector needs only minimal equipment: a single camera and a single microphone suffice.
Method: A multimodal pattern recognition framework is proposed, with solutions provided for each step of the process, namely feature generation and extraction, classification, and evaluation of the system performance. The decision is based on an estimate of the synchrony between the audio and video signals. Prior to classification, an information-theoretic framework is applied to extract optimized audio features using video information. The classification step is then defined through a hypothesis testing framework in order to obtain confidence levels associated with the classifier outputs, thereby allowing the performance of the whole multimodal pattern recognition system to be evaluated.
Results: Through the hypothesis testing approach, the classifier performance can be given as a ratio of detection to false-alarm probabilities. Above all, the hypothesis tests provide a means of measuring the efficiency of the whole pattern recognition process. In particular, the gain offered by the proposed feature extraction step can be evaluated. It is shown that introducing such a feature extraction step increases the ability of the classifier to produce good relative instance scores and, therefore, the performance of the pattern recognition process.
Conclusion: The power of hypothesis tests as an evaluation tool is exploited to assess the performance of a multimodal pattern recognition process. In particular, the benefit of performing a feature extraction step prior to classification is evaluated. Although the proposed framework is used here for detecting the speaker in audiovisual sequences, it could be applied to any other classification task involving two spatio-temporally co-occurring signals.
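As a rough illustration of the synchrony-based decision and the detection/false-alarm evaluation described in the abstract, the following Python sketch may help. It is not the authors' implementation: the histogram mutual-information estimator, the function names (`mutual_information`, `detect_speaker`, `detection_and_false_alarm`), the bin count, and the synthetic signals are all assumptions made for illustration, and the optimized audio feature extraction step is omitted entirely.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in bits for two 1-D sequences.

    Assumption: a simple binned estimator, chosen only for brevity.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def detect_speaker(audio, video_a, video_b, bins=8):
    """Pick the mouth region whose video feature is more synchronous with audio.

    Returns (label, mi_a, mi_b), where label is 0 for region A and 1 for B.
    """
    mi_a = mutual_information(audio, video_a, bins)
    mi_b = mutual_information(audio, video_b, bins)
    return (0 if mi_a >= mi_b else 1), mi_a, mi_b

def detection_and_false_alarm(scores_h1, scores_h0, threshold):
    """Empirical detection and false-alarm probabilities at one threshold.

    scores_h1: scores observed when the tested person is speaking.
    scores_h0: scores observed when the tested person is not speaking.
    """
    p_d = float(np.mean(np.asarray(scores_h1) > threshold))
    p_fa = float(np.mean(np.asarray(scores_h0) > threshold))
    return p_d, p_fa

# Synthetic stand-ins for real features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
audio = rng.normal(size=2000)
video_speaker = audio + 0.5 * rng.normal(size=2000)  # correlated with the audio
video_silent = rng.normal(size=2000)                 # independent of the audio

label, mi_speaker, mi_silent = detect_speaker(audio, video_speaker, video_silent)
p_d, p_fa = detection_and_false_alarm([0.9, 0.8, 0.2], [0.1, 0.3, 0.6], 0.5)
```

Sweeping `threshold` over a range of values would trace how the detection probability trades off against the false-alarm probability, which is one way to compare a pipeline with and without a feature extraction step.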
Pages: 8