AGRONER: An unsupervised agriculture named entity recognition using weighted distributional semantic model

Cited by: 15
Authors
Veena, G. [1 ]
Kanjirangat, Vani [2 ]
Gupta, Deepa [3 ]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Dept Comp Sci & Applicat, Amritapuri, India
[2] Ist Dalle Molle Studi Intelligenza Artificiale USI, Lugano, Switzerland
[3] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Dept Comp Sci & Engn, Bengaluru, India
Keywords
Unsupervised approach; Named entity recognition; BERT; Agriculture; Topic modeling; LDA; EXTRACTION; LEVEL
DOI
10.1016/j.eswa.2023.120440
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a novel weighted distributional semantic model for unsupervised Named Entity Recognition (NER) in domain-specific texts, focusing on the agricultural domain. Developing accurate agriculture NER models requires overcoming several challenges, including the lack of annotated data, domain-specific vocabulary, entity ambiguity, and contextual variation. The proposed approach is completely unsupervised and utilizes an extended BERT model with LDA topic modeling (exBERT_LDA+) for NER. The proposed Agricultural Named Entity Recognition (AGRONER) model focuses on identifying six major entities: disease, soil, pathogen, pesticide, crops, and place. The first four entities are recognized using the proposed algorithm, while we utilize the AGROVOC dictionary for crops and Geocoding APIs for place entities. Due to the absence of a benchmark dataset in the agriculture domain, we created a corpus of 30,000 sentences extracted from recognized agriculture sites. For the evaluation, we used a test corpus of 700 sentences that include 1690 entity names. The labeled entities were then manually checked to evaluate the prediction accuracy. The proposed approach achieves a macro-average F-measure of 80.43%, which is quite promising for unsupervised domain-specific entity labeling. We performed ablation studies, in which the proposed model exhibited relative percentage improvements in F-measure of 31.56% and 26.11% when compared to BERT without LDA (BERT_LDA-) and extended BERT without LDA (exBERT_LDA-) models, respectively. Experimental results show the efficacy of the proposed approach in labeling the named entities in an unsupervised set-up for the agricultural domain. Further, the approach can be easily extended to recognize more domain-specific entities.
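The two evaluation figures quoted in the abstract, a macro-average F-measure and a relative percentage improvement over a baseline, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names and the per-entity counts are hypothetical, and it assumes the macro average is the unweighted mean of per-entity F-measures and that relative improvement means (new - baseline) / baseline, neither of which the abstract specifies.

```python
from typing import Dict, Tuple


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1) from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f_measure(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-entity F-measures (assumed definition)."""
    return sum(f_measure(*c) for c in counts.values()) / len(counts)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative percentage improvement of `new` over `baseline` (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Hypothetical (tp, fp, fn) counts for illustration only; not data from the paper.
toy_counts = {
    "disease": (8, 2, 2),
    "soil": (5, 5, 0),
    "pathogen": (3, 1, 3),
}

macro_f = macro_f_measure(toy_counts)
gain = relative_improvement(75.0, 60.0)
```

Because every entity type contributes equally to a macro average, a rare entity type with poor scores lowers the result as much as a frequent one would.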
Pages: 20
Related papers
50 in total
  • [1] Unsupervised named entity recognition using syntactic and semantic contextual evidence
    Cucchiarelli, A
    Velardi, P
    COMPUTATIONAL LINGUISTICS, 2001, 27 (01) : 123 - 131
  • [2] Squibs and discussions: Unsupervised named entity recognition using syntactic and semantic contextual evidence
    Cucchiarelli, Alessandro
    Velardi, Paola
    Computational Linguistics, 2001, 27 (01) : 122 - 131
  • [3] Curatable Named-Entity Recognition Using Semantic Relations
    Hsu, Yi-Yu
    Kao, Hung-Yu
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12 (04) : 785 - 792
  • [4] Mining heart disease risk factors in clinical text with named entity recognition and distributional semantic models
    Urbain, Jay
    JOURNAL OF BIOMEDICAL INFORMATICS, 2015, 58 : S143 - S149
  • [5] CycleNER: An Unsupervised Training Approach for Named Entity Recognition
    Iovine, Andrea
    Fang, Anjie
    Fetahu, Besnik
    Rokhlenko, Oleg
    Malmasi, Shervin
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022 : 2916 - 2924
  • [6] Unsupervised Ranking of Knowledge Bases for Named Entity Recognition
    Mrabet, Yassine
    Kilicoglu, Halil
    Demner-Fushman, Dina
    ECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, 285 : 1248 - 1255
  • [7] Conditional Random Fields for Spanish Named Entity Recognition Using Unsupervised Features
    Copara, Jenny
    Ochoa, Jose
    Thorne, Camilo
    Glavas, Goran
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2016, 2016, 10022 : 175 - 186
  • [8] A Neural Model for Unsupervised Named Entity Classification
    St. Chifu, Emil
    Chifu, Viorica R.
    2008 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING CONTROL & AUTOMATION, VOLS 1 AND 2, 2008 : 1077 - 1082
  • [9] Named Entity Recognition Based on Span Semantic Enhancement
    Geng R.
    Chen Y.
    Tang R.
    Huang R.
    Qin Y.
    Dong B.
    Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University, 2022, 56 (07) : 118 - 126
  • [10] Semantic Crawling: an Approach based on Named Entity Recognition
    Di Pietro, Giulia
    Aliprandi, Carlo
    De Luca, Antonio E.
    Raffaelli, Matteo
    Soru, Tiziana
    2014 PROCEEDINGS OF THE IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2014), 2014 : 695 - 699