Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News

Cited by: 27
Authors
Xie, Lei [1]
Zheng, Lilei [1]
Liu, Zihan [2]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710129, Peoples R China
[2] City Univ Hong Kong, Sch Creat Media, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Laplacian Eigenmaps (LE); spoken document retrieval; story segmentation; topic segmentation; IMAGE SEGMENTATION; TEXT SEGMENTATION; SPEECH; ALGORITHM; PROSODY; CUES;
DOI
10.1109/TASL.2011.2160853
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We propose Laplacian Eigenmaps (LE)-based approaches to automatic story segmentation on speech recognition transcripts of broadcast news. We reinforce story boundaries by applying LE analysis to the sentence connective strength matrix, which reveals the intrinsic geometric structure of stories. Specifically, we construct a Euclidean space in which each sentence is mapped to a vector. As a result, the original inter-sentence connective strength is reflected by the Euclidean distances between the corresponding vectors, and cohesive relations between sentences become geometrically evident. Taking advantage of LE, we present three story segmentation approaches: LE-TextTiling, spectral clustering, and LE-DP. In LE-DP, we formalize story segmentation as a straightforward criterion minimization problem and give a fast dynamic programming solution to it. Extensive story segmentation experiments on three corpora demonstrate that the proposed LE-based approaches achieve superior performance and significantly outperform several state-of-the-art methods. For instance, LE-TextTiling obtains a relative F1-measure increase of 17.8% on the CCTV Mandarin BN corpus compared to conventional TextTiling; LE-DP achieves an F1-measure of 0.7460 on the TDT2 Mandarin BN corpus, significantly outperforming a recent CRF-prosody approach with an F1-measure of 0.6783.
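The two steps the abstract outlines, embedding sentences with Laplacian Eigenmaps and then segmenting by dynamic programming, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `laplacian_eigenmaps` and `segment_dp`, the toy affinity matrix `W`, and the segmentation criterion (within-segment squared deviation from the segment mean) are all assumptions made for illustration; the abstract does not specify the exact criterion used in LE-DP, nor how the number of stories is chosen.

```python
import numpy as np

def laplacian_eigenmaps(W, dim):
    """Embed items from a symmetric, non-negative affinity matrix W.

    Solves L y = lambda D y (L = D - W) via the symmetric normalized
    Laplacian and drops the trivial constant eigenvector.
    Assumes every row of W has a positive sum.
    """
    W = np.asarray(W, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)            # eigenvalues in ascending order
    # Convert back to generalized eigenvectors and skip the first (trivial) one.
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

def segment_dp(X, k):
    """Split the rows of X into k contiguous segments.

    Hypothetical criterion: minimize the total squared deviation of each
    row from the mean of its segment. Returns the start indices of
    segments 2..k (the boundaries).
    """
    n = len(X)
    # Prefix sums give each segment cost in constant time.
    s1 = np.vstack([np.zeros(X.shape[1]), np.cumsum(X, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((X ** 2).sum(axis=1))])

    def cost(i, j):                            # rows i .. j-1
        s = s1[j] - s1[i]
        return (s2[j] - s2[i]) - s.dot(s) / (j - i)

    best = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for seg in range(1, k + 1):
        for j in range(seg, n + 1):
            for i in range(seg - 1, j):
                c = best[seg - 1, i] + cost(i, j)
                if c < best[seg, j]:
                    best[seg, j] = c
                    back[seg, j] = i

    # Walk the back-pointers from the end to recover segment starts.
    starts, j = [], n
    for seg in range(k, 0, -1):
        j = back[seg, j]
        starts.append(int(j))
    return sorted(starts[:-1])                 # drop the start of the first segment (0)

# Toy data: six "sentences" forming two stories of three sentences each,
# with strong connective strength inside a story and weak strength across.
W = np.full((6, 6), 0.05)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

Y = laplacian_eigenmaps(W, dim=1)
boundaries = segment_dp(Y, k=2)
print(boundaries)
```

On this toy matrix the one-dimensional embedding places the first three sentences on one side of zero and the last three on the other, so the dynamic program puts the single boundary between them. The triple loop is quadratic in the number of sentences per segment count, which is adequate for a sketch but is not claimed to match the fast solution described in the paper.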
Pages: 276-289
Page count: 14
Related papers
50 in total
  • [31] On the regularized Laplacian eigenmaps
    Cao, Ying
    Chen, Di-Rong
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2012, 142 (07) : 1627 - 1643
  • [32] A Note on Laplacian Eigenmaps
    潘荣英
    张晓东
    Journal of Shanghai Jiaotong University(Science), 2009, 14 (05) : 632 - 634
  • [33] Laplacian Eigenmaps of Graphs
    Wang Tianfei
    Yang Jin
    Li Bin
    PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE OF MODELLING AND SIMULATION (ICMS2011), VOL 1, 2011, : 247 - 250
  • [34] A note on laplacian eigenmaps
    Pan R.-Y.
    Zhang X.-D.
    Journal of Shanghai Jiaotong University (Science), 2009, 14 (5) : 632 - 634
  • [35] Automatic transcription of Broadcast News
    Chen, SS
    Eide, E
    Gales, MJF
    Gopinath, RA
    Kanevsky, D
    Olsen, P
    SPEECH COMMUNICATION, 2002, 37 (1-2) : 69 - 87
  • [36] Generalized Laplacian Eigenmaps
    Zhu, Hao
    Koniusz, Piotr
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [37] Varying Input Segmentation for Story Boundary Detection in English, Arabic and Mandarin Broadcast News
    Rosenberg, Andrew
    Sharifi, Mehrbod
    Hirschberg, Julia
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1745 - 1748
  • [38] Automatic Story Segmentation for TV News Video Using Multiple Modalities
    Dumont, Emilie
    Quenot, Georges
    INTERNATIONAL JOURNAL OF DIGITAL MULTIMEDIA BROADCASTING, 2012, 2012
  • [39] UNSUPERVISED BROADCAST NEWS STORY SEGMENTATION USING DISTANCE DEPENDENT CHINESE RESTAURANT PROCESSES
    Yang, Chao
    Xie, Lei
    Zhou, Xiangzeng
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [40] Automatic Identification of Broadcast News Story Boundaries Using the Unification Method for Popular Nouns
    Khalaf, Zainab Ali
    Ping, Tan Tien
    2013 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2013, : 577 - 584