TLATR: Automatic Topic Labeling Using Automatic (Domain-Specific) Term Recognition

Cited by: 5
Authors
Truica, Ciprian-Octavian [1]
Apostol, Elena-Simona [1]
Affiliations
[1] Univ Politehn Bucuresti, Fac Automat Control & Comp, Dept Comp Sci & Engn, Bucharest 060042, Romania
Source
IEEE ACCESS | 2021 / Vol. 9
Keywords
Labeling; Task analysis; Mutual information; Semantics; Indexes; Computational modeling; Bit error rate; Automatic term recognition; automatic topic labeling evaluation; topic labeling; topic modeling; INFORMATION-RETRIEVAL; PROBABILISTIC MODEL;
DOI
10.1109/ACCESS.2021.3083000
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Topic modeling uses probabilistic graphical models to discover latent topics in text corpora, representing each topic as a multinomial distribution over words. Topic labeling is used to assign meaningful labels to the discovered topics. In this paper, we present a new topic labeling method that uses automatic term recognition to discover and assign relevant labels to each topic, i.e., TLATR (Topic Labeling using Automatic Term Recognition). TLATR uses domain-specific multi-terms that appear in the set of documents belonging to a topic. The multi-term having the highest score as determined by the automatic term recognition algorithm is chosen as the label for that topic. To evaluate TLATR, we use two real, publicly available datasets that contain scientific articles' abstracts. The topic label evaluation is done both automatically and using human annotators. For the automatic evaluation, we use Pointwise Mutual Information, Normalized Pointwise Mutual Information, and document similarity. For human evaluation, we employ the average rating method. Furthermore, we also evaluate the quality of the topic models using the Adjusted Rand Index. To show that our novel method extracts relevant topic labels, we compare TLATR with two state-of-the-art methods, one supervised and one unsupervised, provided by the NETL Automatic Topic Labelling system. The experimental results show that our method outperforms or achieves results similar to both NETL's supervised and unsupervised approaches.
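The abstract names Pointwise Mutual Information (PMI) and Normalized PMI (NPMI) as automatic evaluation measures. As a rough illustration only, the sketch below computes NPMI for a word pair from document-level co-occurrence. The function name `npmi`, the document-level counting scheme, and the handling of edge cases are assumptions made for this example; the record does not say how the paper estimates the probabilities or applies the scores to labels.

```python
import math


def npmi(word_a, word_b, documents):
    """Normalized PMI of two words, estimated from document co-occurrence.

    `documents` is a list of token lists. Probabilities are the fraction of
    documents containing a word (or both words). This counting scheme is an
    assumption for illustration, not the paper's stated procedure.
    """
    n = len(documents)
    doc_sets = [set(doc) for doc in documents]
    p_a = sum(word_a in d for d in doc_sets) / n
    p_b = sum(word_b in d for d in doc_sets) / n
    p_ab = sum(word_a in d and word_b in d for d in doc_sets) / n

    if p_ab == 0.0:
        # The words never co-occur: NPMI takes its lower bound.
        return -1.0
    if p_ab == 1.0:
        # Both words occur in every document, so PMI is 0 and the
        # normalizer -log(p_ab) is 0; treated as 0.0 here (assumption).
        return 0.0

    pmi = math.log(p_ab / (p_a * p_b))
    return pmi / -math.log(p_ab)


# Toy corpus, invented for this example.
docs = [
    ["topic", "model"],
    ["topic", "model"],
    ["term", "label"],
    ["topic", "label"],
]
score = npmi("topic", "model", docs)
```

NPMI ranges from -1 (the words never co-occur) to 1 (they always occur together), which makes scores comparable across word pairs with different frequencies, unlike raw PMI.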
Pages: 76624 - 76641
Number of pages: 18
Related papers
50 in total
  • [31] Towards Domain-specific Model Editors with Automatic Model Completion
    Sen, Sagar
    Baudry, Benoit
    Vangheluwe, Hans
    SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2010, 86 (02): : 109 - 126
  • [32] DEXTER: Automatic Extraction of Domain-Specific Glossaries for Language Teaching
    Perinan-Pascual, Carlos
    Mestre-Mestre, Eva M.
    CURRENT WORK IN CORPUS LINGUISTICS: WORKING WITH TRADITIONALLY-CONCEIVED CORPORA AND BEYOND (CILC2015), 2015, 198 : 377 - 385
  • [33] ParAgent: A domain-specific semi-automatic parallelization tool
    Mitra, S
    Kothari, SC
    Cho, J
    Krishnaswamy, A
    HIGH PERFORMANCE COMPUTING - HIPC 2000, PROCEEDINGS, 2001, 1970 : 141 - 148
  • [34] Using UML as a Domain-Specific Modeling Language: A Proposal for Automatic Generation of UML Profiles
    Giachetti, Giovanni
    Marin, Beatriz
    Pastor, Oscar
    ADVANCED INFORMATION SYSTEMS ENGINEERING, PROCEEDINGS, 2009, 5565 : 110 - 124
  • [35] Automatic topic segmentation and labeling in multiparty dialogue
    Hsueh, Pei-Yun
    Moore, Johanna D.
    2006 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, 2006, : 98 - +
  • [36] Automatic Design of Domain-Specific Instructions for Low-Power Processors
    Gonzalez-Alvarez, Cecilia
    Sartor, Jennifer B.
    Alvarez, Carlos
    Jimenez-Gonzalez, Daniel
    Eeckhout, Lieven
    PROCEEDINGS OF THE ASAP2015 2015 IEEE 26TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS, 2015, : 1 - 8
  • [37] Automatic generation of Truffle-based interpreters for Domain-Specific Languages
    Leduc, Manuel
    Jouneaux, Gwendal
    Degueule, Thomas
    Le Guernic, Gurvan
    Barais, Olivier
    Combemale, Benoit
    JOURNAL OF OBJECT TECHNOLOGY, 2020, 19 (02):
  • [38] Automatic Summarization of Domain-specific Forum Threads: Collecting Reference Data
    Verberne, Suzan
    van den Bosch, Antal
    Wubben, Sander
    Krahmer, Emiel
    CHIIR'17: PROCEEDINGS OF THE 2017 CONFERENCE HUMAN INFORMATION INTERACTION AND RETRIEVAL, 2017, : 253 - 256
  • [39] An Automatic Approach for Discovering and Geocoding Locations in Domain-Specific Web Data
    Mattmann, Chris A.
    Sharan, Madhav
    PROCEEDINGS OF 2016 IEEE 17TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI), 2016, : 87 - 93
  • [40] Automatic Construction of Domain-specific Sentiment Lexicon Based on the Semantics Graph
    Xiong, Gen
    Fang, Yilin
    Liu, Quan
    2017 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2017,