Bootstrapping lexical knowledge from unsegmented text using graph kernels

Authors
Hagiwara M. [1]
Ogawa Y. [2]
Toyama K. [2]
Affiliations
[1] Graduate School of Information Science, Nagoya University
Keywords
Bootstrapping; Graph kernel; Link analysis; Named entity extraction; Semantic category; Unsegmented text;
DOI
10.1527/tjsai.26.440
Abstract
Extraction of named entity classes and their relationships from large corpora often involves morphological analysis of target sentences and tends to suffer from out-of-vocabulary words. In this paper we propose a semantic category extraction algorithm called Monaka and its graph-based extension g-Monaka, both of which use character n-gram based patterns as context to directly extract semantically related instances from unsegmented Japanese text. These algorithms also use "bidirectional adjacent constraints," which state that reliable instances should be placed between reliable left and right context patterns, in order to improve proper segmentation. The Monaka algorithms use iterative induction of instances and patterns, similarly to the bootstrapping algorithm Espresso. The g-Monaka algorithm further formalizes the adjacency relation of character n-grams as a directed graph and applies the von Neumann kernel and the Laplacian kernel so that the negative effect of semantic drift, i.e., a phenomenon in which semantically unrelated general instances are extracted, is reduced. The experiments show that g-Monaka substantially increases the performance of semantic category acquisition compared to conventional methods, including distributional similarity, bootstrapping-based Espresso, and its graph-based extension g-Espresso, in terms of F-value on the NE category task from unsegmented Japanese newspaper articles.
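The abstract names the von Neumann kernel and the Laplacian kernel but does not define them. As a rough illustration only, the sketch below assumes the forms commonly used in link analysis: the von Neumann kernel K = A Σ (βA)^n and the regularized Laplacian kernel R = Σ (−βL)^n with L = D − A. The toy matrix `ADJ`, the value of `BETA`, and all helper names are hypothetical, the toy graph is symmetric for simplicity (the paper describes a directed graph), and nothing here is taken from the g-Monaka implementation.

```python
# Illustrative sketch only: toy data and helper names are hypothetical,
# and the kernel definitions are assumed, not taken from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_series(base, beta, terms=60):
    """Approximate sum_{n>=0} (beta * base)^n by its first `terms` terms.

    The series converges only when beta times the spectral radius
    of `base` is below 1.
    """
    scaled = [[beta * x for x in row] for row in base]
    total = identity(len(base))
    power = identity(len(base))
    for _ in range(terms - 1):
        power = matmul(power, scaled)  # next power of (beta * base)
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    return total

def von_neumann_kernel(adj, beta, terms=60):
    """K = A * sum_n (beta A)^n, which equals A (I - beta A)^{-1}."""
    return matmul(adj, neumann_series(adj, beta, terms))

def graph_laplacian(adj):
    """L = D - A, with D the diagonal matrix of the row sums of A."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j]
             for j in range(n)] for i in range(n)]

def regularized_laplacian_kernel(adj, beta, terms=60):
    """R = sum_n (-beta L)^n, which equals (I + beta L)^{-1}."""
    neg_lap = [[-x for x in row] for row in graph_laplacian(adj)]
    return neumann_series(neg_lap, beta, terms)

# Toy symmetric similarity graph over three hypothetical instances.
ADJ = [[1.0, 1.0, 0.0],
       [1.0, 2.0, 1.0],
       [0.0, 1.0, 1.0]]
BETA = 0.1  # small enough for both series to converge on this toy graph

K = von_neumann_kernel(ADJ, BETA)
R = regularized_laplacian_kernel(ADJ, BETA)
```

In both kernels, `BETA` controls how much weight longer paths through the graph receive. Row `i` of `K` or `R` can be read as relatedness scores between instance `i` and every other instance, which is how a seed instance would be used to rank candidates in this sketch.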
Pages: 440-450
Page count: 10
Related papers
50 in total
  • [21] Automatic Text Document Summarization Using Graph Based Centrality Measures on Lexical Network
    Yadav, Chandra Shakhar
    Sharan, Aditi
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2018, 8 (03) : 14 - 32
  • [22] A Framework to Construct Financial Causality Knowledge Graph from Text
    Xu, Ziwei
    Takamura, Hiroya
    Ichise, Ryutaro
    18TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC 2024, 2024, : 57 - 64
  • [23] Preface for the International Workshop on Knowledge Graph Generation from Text
    Tiwari, Sanju
    Mihindukulasooriya, Nandana
    Osborne, Francesco
    Kontokostas, Dimitris
    D’Souza, Jennifer
    Kejriwal, Mayank
    CEUR Workshop Proceedings, 2022, 3184
  • [24] Extracting triples from Vietnamese text to create knowledge graph
    Huong Duong To
    Phuc Do
    2020 12TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (IEEE KSE 2020), 2020, : 219 - 223
  • [25] Semantic role labeling for knowledge graph extraction from text
    Mehwish Alam
    Aldo Gangemi
    Valentina Presutti
    Diego Reforgiato Recupero
    Progress in Artificial Intelligence, 2021, 10 : 309 - 320
  • [26] Semantic role labeling for knowledge graph extraction from text
    Alam, Mehwish
    Gangemi, Aldo
    Presutti, Valentina
    Reforgiato Recupero, Diego
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2021, 10 (03) : 309 - 320
  • [27] Using text mining to establish knowledge graph from accident/incident reports in risk assessment
    Liu, Chang
    Yang, Shiwu
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [28] Mining Protein Interactions from Text Using Convolution Kernels
    Narayanan, Ramanathan
    Misra, Sanchit
    Lin, Simon
    Choudhary, Alok
    NEW FRONTIERS IN APPLIED DATA MINING, 2010, 5669 : 118 - +
  • [29] Lexical semantics and knowledge representation in multilingual text generation
    Di Eugenio, B
    COMPUTATIONAL LINGUISTICS, 2000, 26 (02) : 270 - 273
  • [30] Bootstrapping text recognition from stop words
    Ho, TK
    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2, 1998, : 605 - 609