SPEECH OVERLAP DETECTION USING CONVOLUTIVE NON-NEGATIVE SPARSE CODING: NEW IMPROVEMENTS AND INSIGHTS

Cited by: 0
Authors
Geiger, Juergen T. [1]
Vipperla, Ravichander [2]
Evans, Nicholas [2]
Schuller, Bjoern [1]
Rigoll, Gerhard [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-8000 Munich, Germany
[2] EURECOM, Multimedia Commun Dept, Sophia Antipolis, France
Keywords
speech overlap detection; convolutive non-negative sparse coding; speaker diarization
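DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract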
This paper presents recent advances in the application of convolutive non-negative sparse coding (CNSC) to the problem of overlap detection in the context of conference meetings and speaker diarization. CNSC is used to project a mixed speaker signal onto separate speaker bases and hence to detect intervals of competing speech. We present new energy ratio and total energy features which give significant improvements over our previous work. The system is assessed using a subset of the AMI meeting corpus. We report results which are comparable to the state of the art and which support the potential of a new approach to overlap detection. An analysis of system performance highlights the importance of further work to address weaknesses in detecting particularly short segments of overlapping speech.
Pages: 340-344
Page count: 5
Related papers
50 in total
  • [42] Single-Channel Speech Separation using Sparse Non-Negative Matrix Factorization
    Schmidt, Mikkel N.
    Olsson, Rasmus K.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2614 - 2617
  • [43] Combining Non-negative Matrix Factorization and Sparse Coding for Functional Brain Overlapping Community Detection
    Li, X.
    Hu, Z.
    Wang, H.
    COGNITIVE COMPUTATION, 2018, 10 (06) : 991 - 1005
  • [44] Image Classification by Non-Negative Sparse Coding, Low-Rank and Sparse Decomposition
    Zhang, Chunjie
    Liu, Jing
    Tian, Qi
    Xu, Changsheng
    Lu, Hanqing
    Ma, Songde
    2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011, : 1673 - 1680
  • [45] Sparse coding of human motion trajectories with non-negative matrix factorization
    Vollmer, Christian
    Hellbach, Sven
    Eggert, Julian
    Gross, Horst-Michael
    NEUROCOMPUTING, 2014, 124 : 22 - 32
  • [46] A New Denoising Approach for Sound Signals Based on Non-negative Sparse Coding of Power Spectra
    Shang, Li
    Cao, Fengwen
    Zhang, Jinfeng
    ADVANCES IN NEURAL NETWORKS - ISNN 2008, PT 2, PROCEEDINGS, 2008, 5264 : 359 - 366
  • [47] Convex Hull Convolutive Non-negative Matrix Factorization Based Speech Enhancement For Multimedia Communication
    Wang, Dongxia
    Cui, Jie
    Wang, Jinghua
    Tan, Huan
    Xu, Ming
    2022 6TH INTERNATIONAL CONFERENCE ON CRYPTOGRAPHY, SECURITY AND PRIVACY, CSP 2022, 2022, : 138 - 142
  • [48] SPARSE NON-NEGATIVE DECOMPOSITION OF SPEECH POWER SPECTRA FOR FORMANT TRACKING
    Durrieu, Jean-Louis
    Thiran, Jean-Philippe
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5260 - 5263
  • [49] Constrained non-negative sparse coding using learnt instrument templates for realtime music transcription
    Carabias-Orti, J. J.
    Rodriguez-Serrano, F. J.
    Vera-Candeas, P.
    Canadas-Quesada, F. J.
    Ruiz-Reyes, N.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (07) : 1671 - 1680
  • [50] Natural image compression using an extended non-negative sparse coding neural network technique
    Shang, L
    Huang, DS
    Zheng, CH
    Sun, ZL
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 1866 - 1871