GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training

Cited by: 0
Authors
MA Kai [1, 2]
HU Xinxin [1, 2]
TIAN Miao [3]
TAN Yongjian [1, 2]
ZHENG Shuai [1, 2]
TAO Liufeng [3, 4, 5]
QIU Qinjun [3, 4, 5]
Affiliations
[1] Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University
[2] College of Computer and Information Technology, China Three Gorges University
[3] Key Laboratory of Geological Survey and Evaluation of Ministry of Education, China University of Geosciences
[4] School of Computer Science, China University of Geosciences
[5] Key Laboratory of Quantitative Resource Evaluation and Information Engineering, Ministry of Natural Resources, China University of
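Keywords
DOI
Not available
Chinese Library Classification number
TP391.1 [Text information processing]; P628 [Mathematical prospecting];
Discipline classification code
Abstract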
Geological reports are an important source of geological data and contain rich expert and domain knowledge. A central challenge for current research on geological knowledge extraction and mining is how to use domain knowledge to guide an accurate understanding of these reports. Generic named entity recognition models and tools can be applied to geoscience reports and documents, but they lack domain-specific knowledge, which leads to a pronounced decline in recognition accuracy. With reference to the ontological system of the geological domain, this study summarizes six typical types of geological entities and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT-adversarial training Bi-directional Long Short-Term Memory Global Pointer) is proposed to address the ambiguity, diversity and nesting of geological entities. The model first uses the fine-tuned, word-granularity-based pre-training model GeoWoBERT (Geological Word-base BERT) and combines it with text features extracted by a BiLSTM (Bi-directional Long Short-Term Memory) network. An adversarial training algorithm is then applied to improve the model's robustness and resistance to interference, and decoding is finally performed with a global association pointer algorithm. Experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
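The abstract names a pointer-based decoding step for nested entities but gives no details. The following is a minimal sketch of how span-level decoding of this kind is commonly done, not the authors' implementation: each entity type has a score for every (start, end) token pair, and every span whose score exceeds a threshold is emitted, so overlapping and nested spans can coexist. The function name `decode_spans`, the labels `LOC` and `ROCK`, the toy tokens, the scores and the threshold of 0 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (label, start index, end index inclusive, surface text)
Span = Tuple[str, int, int, str]


def empty_scores(n: int, fill: float = -1.0) -> List[List[float]]:
    """Build an n x n score matrix where every span starts as a negative."""
    return [[fill] * n for _ in range(n)]


def decode_spans(
    tokens: List[str],
    scores: Dict[str, List[List[float]]],
    threshold: float = 0.0,
) -> List[Span]:
    """Return every span whose score is above the threshold.

    scores[label][i][j] is the (hypothetical) model score for the span
    tokens[i..j] being an entity of type `label`. Only spans with
    i <= j are considered. Each span is judged independently, so a
    short entity nested inside a longer one is kept as well.
    """
    spans: List[Span] = []
    for label, matrix in scores.items():
        for start in range(len(tokens)):
            for end in range(start, len(tokens)):
                if matrix[start][end] > threshold:
                    text = " ".join(tokens[start:end + 1])
                    spans.append((label, start, end, text))
    # Sort by position so the output order does not depend on dict order.
    return sorted(spans, key=lambda s: (s[1], s[2], s[0]))


# Toy input: labels and scores are invented for this example.
tokens = ["Yangtze", "River", "Delta", "sandstone"]
scores = {
    "LOC": empty_scores(len(tokens)),
    "ROCK": empty_scores(len(tokens)),
}
scores["LOC"][0][1] = 2.3   # "Yangtze River"
scores["LOC"][0][2] = 1.7   # "Yangtze River Delta" (contains the span above)
scores["ROCK"][3][3] = 3.1  # "sandstone"

result = decode_spans(tokens, scores)
for span in result:
    print(span)
```

Because each (start, end) pair is scored on its own, the nested pair "Yangtze River" and "Yangtze River Delta" are both returned, which a single tag per token could not express.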
Pages: 1404-1417
Page count: 14