TLATR: Automatic Topic Labeling Using Automatic (Domain-Specific) Term Recognition

Cited by: 5
Authors
Truica, Ciprian-Octavian [1]
Apostol, Elena-Simona [1]
Affiliations
[1] Univ Politehn Bucuresti, Fac Automat Control & Comp, Dept Comp Sci & Engn, Bucharest 060042, Romania
Source
IEEE Access | 2021 / Vol. 9
Keywords
Labeling; Task analysis; Mutual information; Semantics; Indexes; Computational modeling; Bit error rate; Automatic term recognition; automatic topic labeling evaluation; topic labeling; topic modeling; INFORMATION-RETRIEVAL; PROBABILISTIC MODEL
DOI
10.1109/ACCESS.2021.3083000
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Topic modeling is a probabilistic graphical model for discovering latent topics in text corpora by using multinomial distributions of topics over words. Topic labeling is used to assign meaningful labels to the discovered topics. In this paper, we present a new topic labeling method that uses automatic term recognition to discover and assign relevant labels for each topic, i.e., TLATR (Topic Labeling using Automatic Term Recognition). TLATR uses domain-specific multi-terms that appear in the set of documents belonging to a topic. The multi-term having the highest score as determined by the automatic term recognition algorithm is chosen as the label for that topic. To evaluate TLATR, we use two real, publicly available datasets that contain scientific articles' abstracts. The topic label evaluation is done both automatically and using human annotators. For the automatic evaluation, we use Pointwise Mutual Information, Normalized Pointwise Mutual Information, and document similarity. For human evaluation, we employ the average rating method. Furthermore, we also evaluate the quality of the topic models using the Adjusted Rand Index. To show that our novel method extracts relevant topic labels, we compare TLATR with two state-of-the-art methods, one supervised and one unsupervised, provided by the NETL Automatic Topic Labelling system. The experimental results show that our method outperforms or performs on par with both NETL's supervised and unsupervised approaches.
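The labeling step described in the abstract (score the multi-word terms found in a topic's documents, keep the top-scoring one) and the NPMI evaluation can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the automatic term recognition algorithm, so the score `log2(term length) * frequency` used here is a placeholder assumption, and the function names `candidate_terms`, `label_topic`, and `npmi` are hypothetical. NPMI is computed at document level, which is one common convention and may differ from the paper's setup.

```python
import math
from collections import Counter


def candidate_terms(tokens, max_len=3):
    """Yield every contiguous n-gram of 2..max_len tokens as a candidate multi-word term."""
    for n in range(2, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def label_topic(docs, max_len=3):
    """Return the highest-scoring multi-word term found in one topic's documents.

    ASSUMPTION: log2(length) * frequency stands in for a real ATR measure.
    """
    counts = Counter()
    for doc in docs:
        counts.update(candidate_terms(doc.lower().split(), max_len))
    scores = {term: math.log2(len(term)) * freq for term, freq in counts.items()}
    # Ties are broken by comparing the term tuples themselves.
    best = max(scores, key=lambda term: (scores[term], term))
    return " ".join(best)


def npmi(word_a, word_b, docs):
    """Document-level NPMI of two words: PMI(a, b) / -log p(a, b), in [-1, 1]."""
    doc_sets = [set(doc.lower().split()) for doc in docs]
    n = len(doc_sets)
    p_a = sum(word_a in s for s in doc_sets) / n
    p_b = sum(word_b in s for s in doc_sets) / n
    p_ab = sum(word_a in s and word_b in s for s in doc_sets) / n
    if p_ab == 0:
        return -1.0  # the words never co-occur
    if p_ab == 1:
        return 1.0  # the words co-occur in every document (limit case)
    pmi = math.log(p_ab / (p_a * p_b))
    return pmi / -math.log(p_ab)


topic_docs = ["topic model inference", "topic model evaluation", "neural topic model"]
label = label_topic(topic_docs)
coherence = npmi("topic", "model", topic_docs)
```

In this toy input the bigram "topic model" occurs in all three documents, so it outscores every trigram (each of which occurs once) and is selected as the label.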
Pages: 76624-76641
Number of pages: 18