GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training

Cited by: 0
Authors
MA Kai [1,2]
HU Xinxin [1,2]
TIAN Miao [3]
TAN Yongjian [1,2]
ZHENG Shuai [1,2]
TAO Liufeng [3,4,5]
QIU Qinjun [3,4,5]
Affiliations
[1] Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University
[2] College of Computer and Information Technology, China Three Gorges University
[3] Key Laboratory of Geological Survey and Evaluation of Ministry of Education, China University of Geosciences
[4] School of Computer Science, China University of Geosciences
[5] Key Laboratory of Quantitative Resource Evaluation and Information Engineering, Ministry of Natural Resources, China University of Geosciences
Keywords
DOI
Not available
CLC Classification Number
TP391.1 [Text Information Processing]; P628 [Mathematical Exploration]
Subject Classification Number
Abstract
As important geological data, geological reports contain rich expert and geological knowledge, but the challenge facing current research into geological knowledge extraction and mining is how to achieve an accurate understanding of geological reports guided by domain knowledge. While generic named entity recognition models and tools can be applied to geoscience reports and documents, their effectiveness is hampered by a lack of domain-specific knowledge, which leads to a pronounced decline in recognition accuracy. This study summarizes six types of typical geological entities, with reference to the ontological system of the geological domain, and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT-adversarial training-Bi-directional Long Short-Term Memory-Global Pointer) is proposed to address the ambiguity, diversity and nesting of geological entities. The model first uses the fine-tuned word-granularity pre-training model GeoWoBERT (Geological Word-base BERT) and combines it with text features extracted by a BiLSTM (Bi-directional Long Short-Term Memory); an adversarial training algorithm is then applied to improve the robustness of the model and enhance its resistance to interference, and decoding is finally performed with a global pointer algorithm. The experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
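The abstract describes a pipeline of a word-granularity pre-trained encoder, a BiLSTM feature layer, adversarial training, and global pointer decoding. Below is a minimal PyTorch sketch of such a pipeline, not the authors' implementation: the checkpoint name bert-base-chinese, the class names GeoNERModel, GlobalPointerHead and FGM, and all hyperparameters are illustrative assumptions, and the span-scoring head omits the rotary position encoding used in the original GlobalPointer formulation.

```python
# Minimal sketch of the described pipeline (assumed names and hyperparameters):
# pre-trained encoder -> BiLSTM -> GlobalPointer-style span scoring,
# with FGM-style adversarial perturbation of the word embeddings.
import torch
import torch.nn as nn
from transformers import BertModel


class GlobalPointerHead(nn.Module):
    """Scores every (start, end) token span for each entity type (RoPE omitted)."""

    def __init__(self, hidden_size: int, num_types: int, head_size: int = 64):
        super().__init__()
        self.num_types, self.head_size = num_types, head_size
        self.dense = nn.Linear(hidden_size, num_types * head_size * 2)

    def forward(self, hidden, attention_mask):
        b, n, _ = hidden.shape
        qk = self.dense(hidden).view(b, n, self.num_types, 2, self.head_size)
        q, k = qk[..., 0, :], qk[..., 1, :]                       # (b, n, types, head)
        logits = torch.einsum("bmth,bnth->btmn", q, k) / self.head_size ** 0.5
        pad = 1.0 - attention_mask.float()                        # 1 where padding
        logits = logits - pad[:, None, :, None] * 1e12 - pad[:, None, None, :] * 1e12
        lower = torch.tril(torch.ones(n, n, device=hidden.device), -1).bool()
        return logits.masked_fill(lower, -1e12)                   # forbid end < start


class GeoNERModel(nn.Module):
    """Encoder + BiLSTM + span-scoring head; the encoder stands in for GeoWoBERT."""

    def __init__(self, encoder_name: str, num_types: int, lstm_hidden: int = 256):
        super().__init__()
        self.encoder = BertModel.from_pretrained(encoder_name)
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.head = GlobalPointerHead(2 * lstm_hidden, num_types)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        hidden, _ = self.bilstm(hidden)
        return self.head(hidden, attention_mask)


class FGM:
    """Fast Gradient Method: add an L2-normalized gradient step to the embeddings."""

    def __init__(self, model, epsilon: float = 1.0, emb_name: str = "word_embeddings"):
        self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
        self.backup = {}

    def attack(self):
        for name, p in self.model.named_parameters():
            if p.requires_grad and self.emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()
                norm = torch.norm(p.grad)
                if norm != 0:
                    p.data.add_(self.epsilon * p.grad / norm)

    def restore(self):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}
```

Under this sketch, a training step would compute the span loss on clean inputs and back-propagate it, call fgm.attack(), back-propagate the loss again on the perturbed embeddings, call fgm.restore(), and only then step the optimizer; the choice of loss (for example, a multi-label span loss over the GlobalPointer logits) is likewise an assumption here.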
Pages: 1404-1417
Number of pages: 14
Related papers
50 records in total
  • [41] Generation of training data for named entity recognition of artworks
    Jain, Nitisha
    Sierra-Munera, Alejandro
    Ehmueller, Jan
    Krestel, Ralf
    SEMANTIC WEB, 2023, 14 (02) : 239 - 260
  • [42] SELF-TRAINING AND PRE-TRAINING ARE COMPLEMENTARY FOR SPEECH RECOGNITION
    Xu, Qiantong
    Baevski, Alexei
    Likhomanenko, Tatiana
    Tomasello, Paden
    Conneau, Alexis
    Collobert, Ronan
    Synnaeve, Gabriel
    Auli, Michael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3030 - 3034
  • [43] CycleNER: An Unsupervised Training Approach for Named Entity Recognition
    Iovine, Andrea
    Fang, Anjie
    Fetahu, Besnik
    Rokhlenko, Oleg
    Malmasi, Shervin
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 2916 - 2924
  • [44] Named Entity Recognition Model of Traditional Chinese Medicine Medical Texts based on Contextual Semantic Enhancement and Adversarial Training
    Ma, Yuekun
    Wen, Moyan
    Liu, He
    IAENG International Journal of Computer Science, 2024, 51 (08) : 1137 - 1143
  • [45] Entity Enhanced BERT Pre-training for Chinese NER
    Jia, Chen
    Shi, Yuefeng
    Yang, Qinrong
    Zhang, Yue
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6384 - 6396
  • [46] Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
    Dong, Haoyu
    Cheng, Zhoujun
    He, Xinyi
    Zhou, Mengyu
    Zhou, Anda
    Zhou, Fan
    Liu, Ao
    Han, Shi
    Zhang, Dongmei
    PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 5426 - 5435
  • [47] In Defense of Image Pre-Training for Spatiotemporal Recognition
    Li, Xianhang
    Wang, Huiyu
    Wei, Chen
    Mei, Jieru
    Yuille, Alan
    Zhou, Yuyin
    Xie, Cihang
    COMPUTER VISION, ECCV 2022, PT XXV, 2022, 13685 : 675 - 691
  • [48] Chinese Named Entity Recognition for Automobile Fault Texts Based on External Context Retrieving and Adversarial Training
    Wang, Shuhai
    Sun, Linfu
    ENTROPY, 2025, 27 (02)
  • [49] Chinese Named Entity Recognition Method Combining ALBERT and a Local Adversarial Training and Adding Attention Mechanism
    Zhang Runmei
    Li Lulu
    Yin Lei
    Liu Jingjing
    Xu Weiyi
    Cao Weiwei
    Chen Zhong
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2022, 18 (01)
  • [50] Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training
    Huang, Peixin
    Zhao, Xiang
    Hu, Minghao
    Fang, Yang
    Li, Xinyi
    Xiao, Weidong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 85 - 96