Multi-features based Semantic Augmentation Networks for Named Entity Recognition in Threat Intelligence

Cited by: 9
Authors
Liu, Peipei [1 ,2 ]
Li, Hong [1 ,2 ]
Wang, Zuoguang [1 ,2 ]
Liu, Jie [1 ,2 ]
Ren, Yimo [1 ,2 ]
Zhu, Hongsong [1 ,2 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
cybersecurity; named entity recognition; multi-features; semantic augmentation; attention mechanism;
DOI
10.1109/ICPR56361.2022.9956373
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting cybersecurity entities such as attackers and vulnerabilities from unstructured network texts is an important part of security analysis. However, the sparsity of intelligence data, which results from the frequent variation and randomness of cybersecurity entity names, makes it difficult for current methods to perform well in extracting security-related concepts and entities. To this end, we propose a semantic augmentation method that incorporates different linguistic features to enrich the representation of input tokens, in order to detect and classify cybersecurity names in unstructured text. In particular, we encode and aggregate the constituent feature, morphological feature, and part-of-speech feature of each input token to improve the robustness of the method. In addition, a token obtains augmented semantic information from its K most similar words in a cybersecurity domain corpus, where an attentive module weighs the differences among those words, and from contextual clues based on a large-scale general-domain corpus. We have conducted experiments on the cybersecurity datasets DNRTI and MalwareTextDB, and the results demonstrate the effectiveness of the proposed method.
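The abstract's idea of augmenting a token with its K most similar domain words, weighted by an attentive module, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function, so cosine similarity for neighbour retrieval and softmax over dot products for the attention weights are assumptions, and the names `TOY`, `top_k_similar`, and `augment` as well as the toy embedding values are hypothetical.

```python
import math

# Hypothetical toy embeddings standing in for vectors trained on a
# cybersecurity domain corpus (values are made up for illustration).
TOY = {
    "trojan": [0.9, 0.1, 0.0],
    "backdoor": [0.8, 0.2, 0.1],
    "ransomware": [0.7, 0.3, 0.0],
    "beijing": [0.0, 0.1, 0.9],
}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def top_k_similar(token, embeddings, k):
    """Return the k words most similar to `token` by cosine similarity."""
    query = embeddings[token]
    scored = [
        (cosine(query, vec), word)
        for word, vec in embeddings.items()
        if word != token
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def augment(token, embeddings, k=2):
    """Concatenate a token vector with an attention-weighted neighbour vector.

    Assumption: attention scores are plain dot products between the token
    and each neighbour; the paper's attentive module may differ.
    """
    query = embeddings[token]
    neighbours = top_k_similar(token, embeddings, k)
    weights = softmax([dot(query, embeddings[w]) for w in neighbours])
    context = [
        sum(wt * embeddings[w][i] for wt, w in zip(weights, neighbours))
        for i in range(len(query))
    ]
    return query + context, dict(zip(neighbours, weights))


if __name__ == "__main__":
    vector, weights = augment("trojan", TOY, k=2)
    print(weights)
    print(vector)
```

In this sketch the augmented representation is simply the original vector concatenated with the weighted neighbour vector; the morphological, constituent, and part-of-speech features mentioned in the abstract are not modelled here.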
Pages: 1557 - 1563
Page count: 7
Related papers
50 in total
  • [21] AERNs: Attention-Based Entity Region Networks for Multi-Grained Named Entity Recognition
    Dai, Jianghai
    Feng, Chong
    Bai, Xuefeng
    Dai, Jinming
    Zhang, Huanhuan
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 408 - 415
  • [22] ALDANER: Active Learning based Data Augmentation for Named Entity Recognition
    Moscato, Vincenzo
    Postiglione, Marco
    Sperli, Giancarlo
    Vignali, Andrea
    KNOWLEDGE-BASED SYSTEMS, 2024, 305
  • [23] Chinese Medical Named Entity Recognition Based on Fusion of Global Features and Multi-Local Features
    Sun, Huarong
    Wang, Jianfeng
    Li, Bo
    Cao, Xiyuan
    Zang, Junbin
    Xue, Chenyang
    Zhang, Zhidong
    IEEE ACCESS, 2023, 11 : 137506 - 137520
  • [24] Multi-level context features extraction for named entity recognition
    Chang, Jun
    Han, Xiaohong
    COMPUTER SPEECH AND LANGUAGE, 2023, 77
  • [25] Chinese named entity recognition based on multilevel linguistic features
    Guo, HL
    Jiang, JM
    Hu, G
    Zhang, T
    NATURAL LANGUAGE PROCESSING - IJCNLP 2004, 2005, 3248 : 90 - 99
  • [26] Finger Vein Recognition Based on Multi-Features Fusion
    Titrek, Fatih
    Baykan, Ömer K.
    TRAITEMENT DU SIGNAL, 2023, 40 (01) : 101 - 113
  • [27] Chinese mineral exploration named entity recognition for literature mining by fusing multi-features with an enhancement domain pre-training model
    Wu, Qirui
    Liu, Zhihao
    Miao, Tian
    Qiu, Qinjun
    Tao, Liufeng
    Chen, Jianguo
    Xie, Zhong
    ORE GEOLOGY REVIEWS, 2025, 176
  • [28] Semantic Role Labeling based on dependency Tree with multi-features
    Shi, Hanxiao
    Zhou, Guodong
    Qian, Peide
    Li, Xiaojun
    2009 INTERNATIONAL JOINT CONFERENCE ON BIOINFORMATICS, SYSTEMS BIOLOGY AND INTELLIGENT COMPUTING, PROCEEDINGS, 2009, : 584 - +
  • [29] Correction to: Novel data augmentation for named entity recognition
    Aluru V. N. M. Hemateja
    Gopikrishnan Kondakath
    Susruta Das
    Mohanaprasad Kothandaraman
    S. Shoba
    Abhishek Pandey
    Rajin Babu
    Abhinav Jain
    International Journal of Speech Technology, 2023, 26 (4) : 879 - 879
  • [30] Data Augmentation for Chinese Clinical Named Entity Recognition
    Wang P.-H.
    Li M.-Z.
    Li S.
    Li, Si (lisi@bupt.edu.cn), 1600, Beijing University of Posts and Telecommunications (43) : 84 - 90