Named Entity Recognition in Chinese Rice Breeding Questions Based on Text Data Augmentation

Authors
Niu, Peiyu [1]
Hou, Chen [2, 3]
Affiliations
[1] College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
[2] National Engineering Laboratory for Big Data Analysis and Applications, Peking University, Beijing 100871, China
[3] PKU-Changsha Institute for Computing and Digital Economy, Changsha 410205, China
Keywords
Data reduction; Labeled data; Metadata; Question answering
DOI
10.6041/j.issn.1000-1298.2024.08.030
Abstract
Current rice breeding question answering systems suffer from low-level data management and coarse knowledge granularity. In addition, publicly available labeled data for named entity recognition in rice breeding are lacking, and manual annotation is costly. To address these issues, a named entity recognition approach based on text data augmentation was proposed for rice breeding questions. A rice breeding knowledge graph was created to help subdivide large named entity categories in rice breeding, such as rice characteristic entities, into smaller subcategories, such as resistance to abiotic stress and eating quality. This sharpened entity boundaries and refined knowledge granularity. Because the high cost of annotating rice breeding data leads to suboptimal named entity recognition performance, the DA-BERT-BILSTM-CRF model was presented, which introduces a data augmentation layer into the BERT-BILSTM-CRF model. Using manually labeled rice breeding questions as training data, the proposed model was compared with three baseline models. In the overall named entity recognition experiment under the small-class entity division, the model achieved a precision of 93.86%, a recall of 92.82%, and an F1 score of 93.34%. Compared with BERT-BILSTM-CRF, the best-performing of the three baseline models, these figures were higher by 4.98, 5.3, and 5.15 percentage points, respectively. The model also performed better on the single-entity recognition metric, achieving a precision of 94.26% and an F1 score of 93.32%. The experiments showed that the proposed approach performed better in both overall and single-class named entity recognition tasks on rice breeding questions. © 2024 Chinese Society of Agricultural Machinery. All rights reserved.
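The reported scores can be cross-checked with a few lines of arithmetic. The sketch below assumes the overall F1 score is the harmonic mean of the overall precision and recall; the abstract does not state the averaging scheme. The baseline values are not given in the abstract. They are derived here by subtracting the stated percentage-point gains, and the function and variable names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
proposed_p, proposed_r = 93.86, 92.82
# Stated gains over the best baseline, in percentage points (P, R, F1).
gains = (4.98, 5.3, 5.15)

proposed_f1 = f1_score(proposed_p, proposed_r)

# Assumption: baseline precision and recall are back-computed from the
# stated gains; they are not quoted directly in the abstract.
baseline_p = proposed_p - gains[0]
baseline_r = proposed_r - gains[1]
baseline_f1 = f1_score(baseline_p, baseline_r)

print(f"proposed F1: {proposed_f1:.2f}")  # 93.34
print(f"baseline F1: {baseline_f1:.2f}")  # 88.19
```

Under that assumption the recomputed F1 of 93.34 matches the reported value, and the derived baseline F1 of 88.19 equals 93.34 minus the stated 5.15-point gain, so the three reported differences are mutually consistent.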
Pages: 333-343