Automatic Topic Modeling for Single Document Short Texts

Cited by: 1
Authors
Sajid, Anamta [1]
Jan, Sadaqat [1]
Shah, Ibrar A. [1]
Affiliations
[1] Univ Engn & Technol, Dept Comp Software Engn, Mardan Campus, Mardan, Pakistan
Keywords
Data mining; text mining; topic modeling
DOI
10.1109/FIT.2017.00020
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
This paper presents a novel approach to automating the extraction of a topic and main title from a single-document short text. The proposed approach uses online text mining and Natural Language Processing techniques. The title of a text offers an efficient way to grasp an overview of its contents at a glance, which is quicker than reading a summary. Three different mechanisms are proposed, implemented, and compared to find the best approach for automatically extracting the topic most relevant to the overall event described in the text. The proposed system is evaluated on fifteen news articles from The New York Times. The significance of the paper is twofold. First, these automatic topic extraction techniques can be used for document classification, document relevance and similarity, summarization, comprehensive understanding of an event, and finding novelty in large, scattered text data by scanning titles. Second, the detailed analysis of various data mining techniques can serve as a roadmap for new researchers. The experimental results show that nouns are more related, reliable, and suitable words for finding the topic of the text.
Pages: 70-75
Page count: 6
Related papers
50 in total
  • [31] BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation
    Wu, Qiong
    Hare, Adam
    Wang, Sirui
    Tu, Yuwei
    Liu, Zhenming
    Brinton, Christopher G.
    Li, Yanhua
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2021, 12 (05)
  • [32] RESEARCH ON CHINESE MULTI-DOCUMENT HIERARCHICAL TOPIC MODELING AUTOMATIC EVALUATION METHODS
    Liu, Yu
    Li, Lei
    Wan, Shuhong
    Gao, Zhiqiao
    2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2014, : 444 - 449
  • [33] The Impact of Weighting Schemes and Stemming Process on Topic Modeling of Arabic Long and Short Texts
    Ma, Tinghuai
    Al-Sabri, Raeed
    Zhang, Lejun
    Marah, Bockarie
    Al-Nabhan, Najla
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (06)
  • [34] Dirichlet Multinomial Mixture with Variational Manifold Regularization: Topic Modeling over Short Texts
    Li, Ximing
    Zhang, Jiaojiao
    Ouyang, Jihong
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 7884 - 7891
  • [35] Topic Modeling for Short Texts with Co-occurrence Frequency-based Expansion
    Pedrosa, Gabriel
    Pita, Marcelo
    Bicalho, Paulo
    Lacerda, Anisio
    Pappa, Gisele L.
    PROCEEDINGS OF 2016 5TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS 2016), 2016, : 277 - 282
  • [36] Topic Modeling with Document Relative Similarities
    Du, Jianguang
    Jiang, Jing
    Song, Dandan
    Liao, Lejian
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 3469 - 3475
  • [37] Topic Discovery for Streaming Short Texts with CTM
    Xu, Yunfeng
    Xu, Hua
    Zhu, Longxia
    Hao, Hanyong
    Deng, Junhui
    Sun, Xiaomin
    Bai, Xiaoli
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
  • [38] Sparse Biterm Topic Model for Short Texts
    Zhu, Bingshan
    Cai, Yi
    Zhang, Huakui
    WEB AND BIG DATA, APWEB-WAIM 2021, PT I, 2021, 12858 : 227 - 241
  • [39] Automatic indexing and abstracting of document texts.
    Karamuftuoglu, M
    JOURNAL OF DOCUMENTATION, 2001, 57 (03) : 460 - 461
  • [40] Unsupervised neural networks for automatic Arabic text summarization using document clustering and topic modeling
    Alami, Nabil
    Meknassi, Mohammed
    En-nahnahi, Noureddine
    El Adlouni, Yassine
    Ammor, Ouafae
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 172