Speaker Identification for the Analysis of Joint Attention in Video

Cited by: 0
Authors
Gonzalez Contreras, Carlos Eduardo [1]
De-la-Torre, Miguel [1]
Gonzalez Becerra, Victor Hugo [1]
Avila-George, Himer [1]
Hernandez Palacio, Raul [2]
Affiliations
[1] Univ Guadalajara, Ameca, Mexico
[2] Univ Autonoma Estado Hidalgo, Pachuca, Hidalgo, Mexico
Source
2019 8TH INTERNATIONAL CONFERENCE ON SOFTWARE PROCESS IMPROVEMENT (CIMPS) | 2019
Keywords
Joint attention; speaker identification; MFCC; GMM; SVM
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Joint attention (AC) is a human skill essential to individual development, including language learning. Experimental studies of AC commonly analyze video recordings of interactions between individuals, and some elements, including each participant's interventions, are registered manually. This work proposes the design of a speaker identification system for the analysis of AC, which provides the sequence of interventions by each speaker in videos of AC scenarios. To support implementation, a comparison of the most common techniques for speaker identification is provided. The features compared are Mel Frequency Cepstral Coefficients (MFCC) and MFCC combined with their delta coefficients (MFCC+deltaMFCC). For classification, Gaussian mixture models (GMM) and support vector machines (SVM) were employed. After 5-fold cross-validation with 30 audio segments of 3-4 seconds each, the results show an accuracy close to 90% using MFCC+deltaMFCC with SVM. This result supports the feasibility of implementing the proposed system.
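The MFCC+deltaMFCC feature and the 5-fold protocol mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regression-based delta formula, the window half-width of 2, the edge padding, and the unshuffled contiguous folds are all assumptions chosen for illustration, and the names `delta`, `append_deltas`, and `k_fold_indices` are hypothetical. The abstract does not say how the 30 segments are distributed among speakers, so the split below treats them as a flat list.

```python
from typing import Iterator, List, Tuple

Frame = List[float]  # one vector of cepstral coefficients per audio frame


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta coefficients (assumed formula):
    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2).
    Frames beyond the edges are replaced by the first/last frame."""
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def append_deltas(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Concatenate each frame with its delta vector (MFCC+deltaMFCC)."""
    return [f + d for f, d in zip(frames, delta(frames, width))]


def k_fold_indices(n_items: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_items) if i < start or i >= stop]
        yield train, test
        start = stop


# Toy input: a single coefficient that rises linearly over five frames.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
stacked = append_deltas(ramp)
folds = list(k_fold_indices(30, k=5))
```

With 30 segments and five folds, each fold holds out 6 segments and trains on the remaining 24. In practice the stacked feature vectors would be passed to a classifier such as a GMM or an SVM, which this sketch leaves out.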
Pages: 7