Bilingual Character Representation for Efficiently Addressing Out-of-Vocabulary Words in Code-Switching Named Entity Recognition

Cited by: 0
Authors
Winata, Genta Indra [1]
Wu, Chien-Sheng [1]
Madotto, Andrea [1]
Fung, Pascale [1]
Affiliations
[1] Hong Kong University of Science and Technology, Center for Artificial Intelligence Research (CAiRE), Department of Electronic and Computer Engineering, Clear Water Bay, Hong Kong, People's Republic of China
Source
COMPUTATIONAL APPROACHES TO LINGUISTIC CODE-SWITCHING | 2018
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose an LSTM-based model with a hierarchical architecture for named entity recognition on code-switching Twitter data. Our model uses bilingual character representation and transfer learning to address out-of-vocabulary words. To mitigate data noise, we propose token replacement and normalization. In the 3rd Workshop on Computational Approaches to Linguistic Code-Switching Shared Task, we achieved second place with a 62.76% harmonic mean F1-score for the English-Spanish language pair without using any gazetteer or knowledge-based information.
Pages: 110-114
Number of pages: 5
Related papers
5 in total
  • [1] Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods
    Ye, Lingxuan
    Cheng, Gaofeng
    Yang, Runyan
    Yang, Zehui
    Tian, Sanli
    Zhang, Pengyuan
    Yan, Yonghong
    INTERSPEECH 2022, 2022: 3163-3167
  • [2] Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition
    Winata, Genta Indra
    Lin, Zhaojiang
    Shin, Jamin
    Liu, Zihan
    Fung, Pascale
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019: 3541-3547
  • [3] MINER: Improving Out-of-Vocabulary Named Entity Recognition from an Information Theoretic Perspective
    Wang, Xiao
    Dou, Shihan
    Xiong, Limao
    Zou, Yicheng
    Zhang, Qi
    Gui, Tao
    Qiao, Liang
    Cheng, Zhanzhan
    Huang, Xuanjing
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022: 5590-5600
  • [4] Learning Multilingual Meta-Embeddings for Code-Switching Named Entity Recognition
    Winata, Genta Indra
    Lin, Zhaojiang
    Fung, Pascale
    4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019: 181-186
  • [5] Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation
    Liang, Zheng
    Song, Zheshu
    Ma, Ziyang
    Du, Chenpeng
    Yu, Kai
    Chen, Xie
    INTERSPEECH 2023, 2023: 919-923