Knowledge Discovery with CRF-Based Clustering of Named Entities without a Priori Classes

Cited by: 0
Authors
Claveau, Vincent [1 ,2 ]
Ncibi, Abir [1 ,2 ]
Affiliations
[1] IRISA CNRS, Campus Beaulieu, F-35042 Rennes, France
[2] INRIA IRISA, F-35042 Rennes, France
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge discovery aims to bring out coherent groups of entities. It usually relies on clustering, which requires defining a notion of similarity between the relevant entities. In this paper, we propose to repurpose a supervised machine learning technique (namely Conditional Random Fields, widely used for supervised labeling tasks) to compute, indirectly and without supervision, similarities among text sequences. Our approach consists of generating artificial labeling problems on the data to reveal regularities between entities through their labeling. We describe how this framework can be implemented and evaluate it on two information extraction/discovery tasks. The results demonstrate the usefulness of this unsupervised approach and open many avenues for defining similarities for complex representations of textual data.
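The idea of deriving similarities from artificial labeling problems can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no procedural details, so everything below is an assumption made for illustration. In particular, a simple feature-vote classifier stands in for the Conditional Random Field, and the entity names, feature sets, function names, and parameters (`n_rounds`, `n_labels`, `train_frac`) are hypothetical. Each round assigns random labels to a random training subset, labels the remaining entities, and counts how often two held-out entities receive the same label.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations


def train(examples):
    """Count, for each feature, how often it co-occurs with each label.

    Stand-in for CRF training (assumption: a real system would fit a CRF).
    """
    model = defaultdict(Counter)
    for feats, label in examples:
        for f in feats:
            model[f][label] += 1
    return model


def predict(model, feats, labels):
    """Pick the label with the most feature votes (ties: smallest label)."""
    votes = Counter()
    for f in feats:
        votes.update(model.get(f, {}))
    return max(labels, key=lambda l: (votes[l], -l))


def colabel_similarity(entities, n_rounds=200, n_labels=3,
                       train_frac=0.5, seed=0):
    """Estimate pairwise similarity as the co-labeling rate.

    entities: dict mapping an entity name to a set of features.
    Returns a dict mapping (name_a, name_b), with name_a < name_b,
    to the fraction of rounds in which both were held out and
    received the same artificial label.
    """
    rng = random.Random(seed)
    names = sorted(entities)
    labels = list(range(n_labels))
    together, same = Counter(), Counter()
    n_train = max(1, int(len(names) * train_frac))
    for _ in range(n_rounds):
        shuffled = names[:]
        rng.shuffle(shuffled)
        train_names, test_names = shuffled[:n_train], shuffled[n_train:]
        # Artificial labeling problem: random labels on the training part.
        model = train([(entities[n], rng.choice(labels))
                       for n in train_names])
        predicted = {n: predict(model, entities[n], labels)
                     for n in test_names}
        for a, b in combinations(sorted(test_names), 2):
            together[(a, b)] += 1
            same[(a, b)] += predicted[a] == predicted[b]
    return {pair: same[pair] / together[pair] for pair in together}


# Hypothetical toy data: context features observed around each entity.
entities = {
    "Lyon": {"city", "in"},
    "Rennes": {"city", "in"},
    "Paris": {"city", "in", "capital"},
    "Claveau": {"mr", "said"},
    "Ncibi": {"mr", "said"},
    "Martin": {"mr", "said", "dr"},
}
sim = colabel_similarity(entities)
```

The resulting `sim` dictionary can be passed to any clustering method that accepts a pairwise similarity matrix. Entities with identical features always receive the same label in this sketch, so their co-labeling rate is 1.0; other pairs are scored by how often the random labeling problems happen to group them.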
Pages: 415-428
Page count: 14
Related papers
50 in total
  • [1] Combining Knowledge and CRF-Based Approach to Named Entity Recognition in Russian
    Mozharova, V. A.
    Loukachevitch, N. V.
    ANALYSIS OF IMAGES, SOCIAL NETWORKS AND TEXTS, AIST 2016, 2017, 661 : 185 - 195
  • [2] CRF-Based Named Entity Recognition for Myanmar Language
    Mo, Hsu Myat
    Nwet, Khin Thandar
    Soe, Khin Mar
    GENETIC AND EVOLUTIONARY COMPUTING, 2017, 536 : 204 - 211
  • [3] CRF-based Active Learning for Chinese Named Entity Recognition
    Yao, Lin
    Sun, Chengjie
    Li, Shaofeng
    Wang, Xiaolong
    Wang, Xuan
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 1557 - +
  • [4] LTP: A New Active Learning Strategy for CRF-Based Named Entity Recognition
    Liu, Mingyi
    Tu, Zhiying
    Zhang, Tong
    Su, Tonghua
    Xu, Xiaofei
    Wang, Zhongjie
    NEURAL PROCESSING LETTERS, 2022, 54 (03) : 2433 - 2454
  • [5] CRF-Based Czech Named Entity Recognizer and Consolidation of Czech NER Research
    Konkol, Michal
    Konopik, Miloslav
    TEXT, SPEECH, AND DIALOGUE, TSD 2013, 2013, 8082 : 153 - 160
  • [6] LTP: A New Active Learning Strategy for CRF-Based Named Entity Recognition
    Mingyi Liu
    Zhiying Tu
    Tong Zhang
    Tonghua Su
    Xiaofei Xu
    Zhongjie Wang
    Neural Processing Letters, 2022, 54 : 2433 - 2454
  • [7] A CRF-Based Stacking Model with Meta-features for Named Entity Recognition
    Liu, Shifeng
    Sun, Yifang
    Wang, Wei
    Zhou, Xiaoling
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2018, PT II, 2018, 10938 : 54 - 66
  • [8] Multilingual Document Clustering: an Heuristic Approach Based on Cognate Named Entities
    Montalvo, Soto
    Martinez, Raquel
    Casillas, Arantza
    Fresno, Victor
    COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE, 2006, : 1145 - 1152
  • [9] Extending a CRF-based named entity recognition model for Turkish well formed text and user generated content
    Seker, Gokhan Akin
    Eryigit, Gulsen
    SEMANTIC WEB, 2017, 8 (05) : 625 - 642
  • [10] CRF-Based Clustering of Pharmacokinetic Curves from Dynamic Contrast-Enhanced MR Images
    Jurek, Jakub
    Pelesz, Mateusz
    Wojciechowski, Andrzej
    Klepaczko, Artur
    Kocinski, Marek
    Materka, Andrzej
    Losnegard, Are
    Reisaeter, Lars
    Halvorsen, Ole J.
    Beisland, Christian
    Rorvik, Jarle
    Lundervold, Arvid
    2018 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2018, : 174 - 179