Label Attention Network for Structured Prediction

Cited by: 4
Authors
Cui, Leyang [1 ,2 ]
Li, Yafu [1 ,2 ]
Zhang, Yue [2 ,3 ]
Affiliations
[1] Zhejiang Univ, Hangzhou 310007, Peoples R China
[2] Westlake Univ, Sch Engn, Hangzhou 310024, Peoples R China
[3] Westlake Inst Adv Study, Inst Adv Technol, Hangzhou 310024, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Labeling; Task analysis; Tagging; Artificial neural networks; Machine translation; Natural language processing; Encoding; Label attention; label dependency; sequence labeling;
DOI
10.1109/TASLP.2022.3145311
CLC Classification Number
O42 [Acoustics];
Subject Classification Codes
070206; 082403
Abstract
Sequence labeling assigns a label to each token in a sequence and is a fundamental problem in natural language processing (NLP). Many NLP tasks, including part-of-speech (POS) tagging and named entity recognition (NER), can be solved as sequence labeling problems, and other tasks such as constituency parsing and non-autoregressive machine translation can also be transformed into sequence labeling tasks. Neural models have been shown to be powerful for sequence labeling by employing a multi-layer sequence encoding network. The conditional random field (CRF) has been proposed to enrich information over label sequences, yet it suffers from high computational complexity and over-reliance on the Markov assumption. To this end, we propose the label attention network (LAN), which hierarchically refines representations of marginal label distributions bottom-up, enabling higher layers to learn more informed label sequence distributions based on information from lower layers. We demonstrate the effectiveness of LAN through extensive experiments on various NLP tasks, including POS tagging, NER, CCG supertagging, constituency parsing, and non-autoregressive machine translation. Empirical results show that LAN not only improves overall tagging accuracy with a similar number of parameters, but also significantly speeds up training and testing compared to the CRF.
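The mechanism described in the abstract can be illustrated with a short, hypothetical sketch (not the authors' released code): each token representation attends over a shared set of label embeddings, the attention weights serve as a soft marginal label distribution, and the attended label information is passed upward so the next layer can refine it. The layer sizes, module names, and the BiLSTM encoder below are assumptions made only for illustration.

```python
import torch
import torch.nn as nn


class LabelAttentionLayer(nn.Module):
    """One illustrative label-attention layer: tokens attend over a shared
    set of label embeddings; the attention weights act as a soft (marginal)
    label distribution that the layer above can refine."""

    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        # Hypothetical contextual encoder; the paper's exact encoder may differ.
        self.encoder = nn.LSTM(hidden_dim, hidden_dim // 2,
                               batch_first=True, bidirectional=True)
        self.label_emb = nn.Embedding(num_labels, hidden_dim)
        self.scale = hidden_dim ** 0.5

    def forward(self, x):
        # x: (batch, seq_len, hidden_dim) token representations
        h, _ = self.encoder(x)                    # contextualized tokens
        labels = self.label_emb.weight            # (num_labels, hidden_dim)
        scores = h @ labels.t() / self.scale      # (batch, seq_len, num_labels)
        attn = scores.softmax(dim=-1)             # soft label distribution per token
        label_view = attn @ labels                # expected label representation
        # Adding the two views keeps the width fixed; the paper may combine
        # token and label information differently (e.g., by concatenation).
        return h + label_view, attn


class LabelAttentionNetwork(nn.Module):
    """Stack of label-attention layers; the top layer's attention weights are
    read off directly as per-token tag distributions, with no CRF on top."""

    def __init__(self, hidden_dim, num_labels, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            LabelAttentionLayer(hidden_dim, num_labels) for _ in range(num_layers))

    def forward(self, x):
        attn = None
        for layer in self.layers:
            x, attn = layer(x)
        return attn  # (batch, seq_len, num_labels)


# Usage with toy dimensions.
if __name__ == "__main__":
    model = LabelAttentionNetwork(hidden_dim=128, num_labels=45)
    tokens = torch.randn(2, 10, 128)              # e.g., pre-embedded sentences
    tag_dist = model(tokens)
    print(tag_dist.shape)                         # torch.Size([2, 10, 45])
```

Decoding can then be a simple per-token argmax over the top-layer distribution rather than a Viterbi search over label sequences, which is consistent with the speedups over the CRF reported in the abstract.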
Pages: 1235-1248
Number of pages: 14
Related Papers
50 records in total
  • [1] Label Attention Network for Structured Prediction
    Cui, Leyang
    Li, Yafu
    Zhang, Yue
    IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 2022, 30 : 1235 - 1248
  • [2] A Structured Prediction Approach for Label Ranking
    Korba, Anna
    Garcia, Alexandre
    d'Alche-Buc, Florence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [3] Structured Prediction of Network Response
    Su, Hongyu
    Gionis, Aristides
    Rousu, Juho
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 442 - 450
  • [4] Differentiable Dynamic Programming for Structured Prediction and Attention
    Mensch, Arthur
    Blondel, Mathieu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [5] Pseudo-Label-Vector-Guided Parallel Attention Network for Remaining Useful Life Prediction
    Park, Ye-In
    Song, Jou Won
    Kang, Suk-Ju
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19 (04) : 5602 - 5611
  • [6] Structured Attention Network for Referring Image Segmentation
    Lin, Liang
    Yan, Pengxiang
    Xu, Xiaoqian
    Yang, Sibei
    Zeng, Kun
    Li, Guanbin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1922 - 1932
  • [7] POLAT: Protein function prediction based on soft mask graph network and residue-Label ATtention
    Liu, Yang
    Zhang, Yi
    Chen, Zihao
    Peng, Jing
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2024, 110
  • [8] A multimode structured prediction model based on dynamic attribution graph attention network for complex industrial processes
    Sun, Bei
    Lv, Mingjie
    Zhou, Can
    Li, Yonggang
    INFORMATION SCIENCES, 2023, 640
  • [9] A pseudo-label supervised graph fusion attention network for drug-target interaction prediction
    Xie, Yining
    Wang, Xiaodong
    Wang, Pengda
    Bi, Xueyan
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 259
  • [10] Hyperbolic Embedding Inference for Structured Multi-Label Prediction
    Xiong, Bo
    Cochez, Michael
    Nayyeri, Mojtaba
    Staab, Steffen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,