Mining Summary of Short Text with Centroid Similarity Distance

Citations: 1
Authors
Franciscus, Nigel [1 ]
Wang, Junhu [1 ]
Stantic, Bela [1 ]
Affiliation
[1] Inst Integrated & Intelligent Syst, Brisbane, Qld, Australia
Source
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019 | 2019, Vol. 11888
Keywords
Text summarization; Short text; Word embeddings;
DOI
10.1007/978-3-030-35231-8_32
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text summarization aims at producing a concise summary that preserves key information. Many textual inputs are short and do not fit the standard techniques designed for longer text. Most existing short text summarization approaches rely on metadata such as authors or reply networks. However, not all raw textual data can provide such information. In this paper, we present our method to summarize short text using a centroid-based method with word embeddings. In particular, we consider the task when there is no metadata other than the text itself. We show that the centroid embeddings approach can be applied to short text to capture semantically similar sentences for summarization. With a further clustering strategy, we were able to identify relevant sub-topics that further improve the context diversity of the overall summary. The empirical evaluation demonstrates that our approach can outperform other methods on two annotated LREC track datasets.
Pages: 447-461
Page count: 15
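
The centroid-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D vectors in `TOY_EMBEDDINGS`, the helper names `embed`, `cosine`, and `centroid_summary`, and the parameter `k` are all hypothetical, chosen only for illustration. Under those assumptions, each short text is represented by the mean of its word vectors, and the texts closest (by cosine similarity) to the centroid of the whole collection are taken as the summary. The sub-topic clustering step mentioned in the abstract is not covered here.

```python
import numpy as np

# Hypothetical 2-D toy vectors standing in for pretrained word embeddings.
TOY_EMBEDDINGS = {
    "flight": np.array([1.0, 0.0]),
    "delayed": np.array([0.9, 0.1]),
    "airport": np.array([0.8, 0.2]),
    "pizza": np.array([0.0, 1.0]),
    "tasty": np.array([0.1, 0.9]),
}


def embed(text, embeddings):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    return np.mean(vectors, axis=0) if vectors else None


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def centroid_summary(texts, embeddings, k=1):
    """Return the k texts whose embeddings are most similar to the centroid."""
    embedded = [(t, embed(t, embeddings)) for t in texts]
    # Drop texts that contain no known word (assumption: they carry no signal).
    embedded = [(t, v) for t, v in embedded if v is not None]
    centroid = np.mean([v for _, v in embedded], axis=0)
    ranked = sorted(embedded, key=lambda tv: cosine(tv[1], centroid), reverse=True)
    return [t for t, _ in ranked[:k]]


posts = ["flight delayed", "airport flight delayed", "delayed airport", "pizza tasty"]
summary = centroid_summary(posts, TOY_EMBEDDINGS, k=2)
print(summary)
```

In this toy collection the off-topic post pulls the centroid slightly away from the majority, yet the two selected texts still come from the dominant topic. With real data, the toy dictionary would be replaced by pretrained word vectors.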