Recurrent Neural Network-Based Model for Named Entity Recognition with Improved Word Embeddings

Cited by: 2
Authors
Goyal, Archana [1]
Gupta, Vishal [2]
Kumar, Manish [3]
Affiliations
[1] Goswami Ganesh Dutta Sanatan Dharma Coll, PG Dept Informat Technol, Chandigarh 160030, India
[2] Panjab Univ, Univ Inst Engn & Technol, Chandigarh 160014, India
[3] Panjab Univ Reg Ctr, Comp Sci & Applicat, Muktsar, Punjab, India
Keywords
Bidirectional long short-term memory (Bi-LSTM); convolutional neural network (CNN); named entity recognition (NER); recurrent neural network (RNN); word embeddings;
DOI
10.1080/03772063.2021.2006805
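Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;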
Abstract
Extracting meaningful information from the huge amount of data available on the web is a challenging task. An efficient named entity recognition (NER) system can help overcome the challenges of information extraction. Named entities are proper names that play an important role in locating information of interest. This study proposes an efficient deep learning-based NER technique that recognizes general-domain named entities in Hindi, Punjabi, and bilingual Hindi-Punjabi text. A model based on bidirectional long short-term memory (Bi-LSTM), an important variant of the recurrent neural network, has been developed using improved word embeddings. The improved word embeddings combine character-level convolutional neural network embeddings with part-of-speech embeddings. The main finding of the study is an NER system that can extract named entities not only from Hindi and Punjabi datasets individually but also from mixed Hindi and Punjabi text. In addition, the improved word embeddings combine character-level and word-level features, which is novel to the best of our knowledge. The improved word embeddings are found to achieve better results than earlier NER models that use deep feature extraction.
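The combination of character-level CNN embeddings and part-of-speech embeddings described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only says the two are combined, so joining them by concatenation is an assumption, as are all names (`make_table`, `char_cnn`, `improved_embedding`), the dimensions, the filter width, the random initialisation, and the example words and tags. Only the Python standard library is used.

```python
import random

def make_table(keys, dim, rng):
    """Randomly initialised lookup table: key -> vector of length dim."""
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in keys}

def char_cnn(word, char_table, filters, width):
    """1-D convolution over character embeddings, then max-over-time pooling."""
    dim = len(next(iter(char_table.values())))
    pad = [[0.0] * dim] * (width - 1)
    # Unknown characters map to a zero vector in this sketch.
    seq = pad + [char_table.get(c, [0.0] * dim) for c in word] + pad
    pooled = []
    for f in filters:  # each filter is a flat list of width * dim weights
        best = float("-inf")
        for i in range(len(seq) - width + 1):
            window = [x for vec in seq[i:i + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(f, window)))
        pooled.append(max(best, 0.0))  # ReLU applied to the pooled activation
    return pooled

def improved_embedding(word, pos_tag, char_table, filters, width, pos_table):
    """Assumed combination: concatenate char-CNN features with the POS vector."""
    return char_cnn(word, char_table, filters, width) + pos_table[pos_tag]

# Hypothetical sizes and a toy mixed Hindi/Punjabi sentence with illustrative tags.
rng = random.Random(0)
CHAR_DIM, N_FILTERS, WIDTH, POS_DIM = 8, 16, 3, 4
sentence = [("राम", "NNP"), ("ਪੰਜਾਬ", "NNP"), ("गया", "VM")]

# Python iterates strings by code point, so combining marks count as characters.
chars = sorted({c for word, _ in sentence for c in word})
char_table = make_table(chars, CHAR_DIM, rng)
pos_table = make_table(["NNP", "VM"], POS_DIM, rng)
filters = [[rng.uniform(-0.1, 0.1) for _ in range(WIDTH * CHAR_DIM)]
           for _ in range(N_FILTERS)]

vectors = [improved_embedding(w, t, char_table, filters, WIDTH, pos_table)
           for w, t in sentence]
```

Each entry of `vectors` has length `N_FILTERS + POS_DIM`. In a model of the kind the abstract describes, such per-token vectors would be the input sequence to a Bi-LSTM tagger, which is omitted here; in practice the tables and filters would be trained rather than left at random values.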
Pages: 6970-6976
Page count: 7