GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training

Cited by: 0
Authors
MA Kai [1,2]
HU Xinxin [1,2]
TIAN Miao [3]
TAN Yongjian [1,2]
ZHENG Shuai [1,2]
TAO Liufeng [3,4,5]
QIU Qinjun [3,4,5]
Affiliations
[1] Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University
[2] College of Computer and Information Technology, China Three Gorges University
[3] Key Laboratory of Geological Survey and Evaluation of Ministry of Education, China University of Geosciences
[4] School of Computer Science, China University of Geosciences
[5] Key Laboratory of Quantitative Resource Evaluation and Information Engineering, Ministry of Natural Resources, China University of Geosciences
Keywords
DOI
Not available
CLC Classification Number
TP391.1 [Text Information Processing]; P628 [Mathematical Exploration]
Subject Classification Number
Abstract
As important geological data, geological reports contain rich expert and geological knowledge, but the challenge facing current research into geological knowledge extraction and mining is how to achieve an accurate understanding of geological reports guided by domain knowledge. While generic named entity recognition models and tools can be applied to geoscience reports and documents, their effectiveness is hampered by a lack of domain-specific knowledge, which leads to a pronounced decline in recognition accuracy. This study summarizes six types of typical geological entities, with reference to the ontological system of the geological domain, and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT-adversarial training-Bi-directional Long Short-Term Memory-Global Pointer) is proposed to address the ambiguity, diversity and nesting of geological entities. The model first uses the fine-tuned word-granularity pre-training model GeoWoBERT (Geological Word-base BERT) and combines it with text features extracted by a BiLSTM (Bi-directional Long Short-Term Memory); an adversarial training algorithm is then applied to improve the robustness of the model and enhance its resistance to interference, and decoding is finally performed with a global pointer algorithm. The experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
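The abstract describes a pipeline of a word-granularity pre-trained encoder, a BiLSTM feature layer, adversarial training, and global pointer decoding. Below is a minimal PyTorch sketch of such a pipeline, not the authors' implementation: the checkpoint name bert-base-chinese, the class names GeoNERModel, GlobalPointerHead and FGM, and all hyperparameters are illustrative assumptions, and the span-scoring head omits the rotary position encoding used in the original GlobalPointer formulation.

```python
# Minimal sketch of the described pipeline (assumed names and hyperparameters):
# pre-trained encoder -> BiLSTM -> GlobalPointer-style span scoring,
# with FGM-style adversarial perturbation of the word embeddings.
import torch
import torch.nn as nn
from transformers import BertModel


class GlobalPointerHead(nn.Module):
    """Scores every (start, end) token span for each entity type (RoPE omitted)."""

    def __init__(self, hidden_size: int, num_types: int, head_size: int = 64):
        super().__init__()
        self.num_types, self.head_size = num_types, head_size
        self.dense = nn.Linear(hidden_size, num_types * head_size * 2)

    def forward(self, hidden, attention_mask):
        b, n, _ = hidden.shape
        qk = self.dense(hidden).view(b, n, self.num_types, 2, self.head_size)
        q, k = qk[..., 0, :], qk[..., 1, :]                       # (b, n, types, head)
        logits = torch.einsum("bmth,bnth->btmn", q, k) / self.head_size ** 0.5
        pad = 1.0 - attention_mask.float()                        # 1 where padding
        logits = logits - pad[:, None, :, None] * 1e12 - pad[:, None, None, :] * 1e12
        lower = torch.tril(torch.ones(n, n, device=hidden.device), -1).bool()
        return logits.masked_fill(lower, -1e12)                   # forbid end < start


class GeoNERModel(nn.Module):
    """Encoder + BiLSTM + span-scoring head; the encoder stands in for GeoWoBERT."""

    def __init__(self, encoder_name: str, num_types: int, lstm_hidden: int = 256):
        super().__init__()
        self.encoder = BertModel.from_pretrained(encoder_name)
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.head = GlobalPointerHead(2 * lstm_hidden, num_types)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        hidden, _ = self.bilstm(hidden)
        return self.head(hidden, attention_mask)


class FGM:
    """Fast Gradient Method: add an L2-normalized gradient step to the embeddings."""

    def __init__(self, model, epsilon: float = 1.0, emb_name: str = "word_embeddings"):
        self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
        self.backup = {}

    def attack(self):
        for name, p in self.model.named_parameters():
            if p.requires_grad and self.emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()
                norm = torch.norm(p.grad)
                if norm != 0:
                    p.data.add_(self.epsilon * p.grad / norm)

    def restore(self):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}
```

Under this sketch, a training step would compute the span loss on clean inputs and back-propagate it, call fgm.attack(), back-propagate the loss again on the perturbed embeddings, call fgm.restore(), and only then step the optimizer; the choice of loss (for example, a multi-label span loss over the GlobalPointer logits) is likewise an assumption here.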
Pages: 1404-1417
Number of pages: 14
Related papers
50 records in total
  • [41] Generation of training data for named entity recognition of artworks
    Jain, Nitisha
    Sierra-Munera, Alejandro
    Ehmueller, Jan
    Krestel, Ralf
    SEMANTIC WEB, 2023, 14 (02) : 239 - 260
  • [42] SELF-TRAINING AND PRE-TRAINING ARE COMPLEMENTARY FOR SPEECH RECOGNITION
    Xu, Qiantong
    Baevski, Alexei
    Likhomanenko, Tatiana
    Tomasello, Paden
    Conneau, Alexis
    Collobert, Ronan
    Synnaeve, Gabriel
    Auli, Michael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3030 - 3034
  • [43] CycleNER: An Unsupervised Training Approach for Named Entity Recognition
    Iovine, Andrea
    Fang, Anjie
    Fetahu, Besnik
    Rokhlenko, Oleg
    Malmasi, Shervin
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 2916 - 2924
  • [44] Named Entity Recognition Model of Traditional Chinese Medicine Medical Texts based on Contextual Semantic Enhancement and Adversarial Training
    Ma, Yuekun
    Wen, Moyan
    Liu, He
    IAENG International Journal of Computer Science, 2024, 51 (08) : 1137 - 1143
  • [45] Entity Enhanced BERT Pre-training for Chinese NER
    Jia, Chen
    Shi, Yuefeng
    Yang, Qinrong
    Zhang, Yue
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6384 - 6396
  • [46] Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
    Dong, Haoyu
    Cheng, Zhoujun
    He, Xinyi
    Zhou, Mengyu
    Zhou, Anda
    Zhou, Fan
    Liu, Ao
    Han, Shi
    Zhang, Dongmei
    PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 5426 - 5435
  • [47] In Defense of Image Pre-Training for Spatiotemporal Recognition
    Li, Xianhang
    Wang, Huiyu
    Wei, Chen
    Mei, Jieru
    Yuille, Alan
    Zhou, Yuyin
    Xie, Cihang
    COMPUTER VISION, ECCV 2022, PT XXV, 2022, 13685 : 675 - 691
  • [48] Chinese Named Entity Recognition for Automobile Fault Texts Based on External Context Retrieving and Adversarial Training
    Wang, Shuhai
    Sun, Linfu
    ENTROPY, 2025, 27 (02)
  • [49] Chinese Named Entity Recognition Method Combining ALBERT and a Local Adversarial Training and Adding Attention Mechanism
    Zhang Runmei
    Li Lulu
    Yin Lei
    Liu Jingjing
    Xu Weiyi
    Cao Weiwei
    Chen Zhong
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2022, 18 (01)
  • [50] Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training
    Huang, Peixin
    Zhao, Xiang
    Hu, Minghao
    Fang, Yang
    Li, Xinyi
    Xiao, Weidong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 85 - 96