Label Attention Network for Structured Prediction

Cited by: 4
Authors
Cui, Leyang [1 ,2 ]
Li, Yafu [1 ,2 ]
Zhang, Yue [2 ,3 ]
Affiliations
[1] Zhejiang Univ, Hangzhou 310007, Peoples R China
[2] Westlake Univ, Sch Engn, Hangzhou 310024, Peoples R China
[3] Westlake Inst Adv Study, Inst Adv Technol, Hangzhou 310024, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Labeling; Task analysis; Tagging; Artificial neural networks; Machine translation; Natural language processing; Encoding; Label attention; label dependency; sequence labeling;
DOI
10.1109/TASLP.2022.3145311
CLC Classification Number
O42 [Acoustics];
Subject Classification Codes
070206; 082403
Abstract
Sequence labeling assigns a label to each token in a sequence and is a fundamental problem in natural language processing (NLP). Many NLP tasks, including part-of-speech (POS) tagging and named entity recognition (NER), can be solved as sequence labeling problems, and other tasks such as constituency parsing and non-autoregressive machine translation can also be transformed into sequence labeling tasks. Neural models have been shown to be powerful for sequence labeling by employing a multi-layer sequence encoding network. The conditional random field (CRF) has been proposed to enrich information over label sequences, yet it suffers from high computational complexity and over-reliance on the Markov assumption. To this end, we propose the label attention network (LAN), which hierarchically refines representations of marginal label distributions bottom-up, enabling higher layers to learn more informed label sequence distributions based on information from lower layers. We demonstrate the effectiveness of LAN through extensive experiments on various NLP tasks, including POS tagging, NER, CCG supertagging, constituency parsing, and non-autoregressive machine translation. Empirical results show that LAN not only improves overall tagging accuracy with a similar number of parameters, but also significantly speeds up training and testing compared to the CRF.
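The mechanism described in the abstract can be illustrated with a short, hypothetical sketch (not the authors' released code): each token representation attends over a shared set of label embeddings, the attention weights serve as a soft marginal label distribution, and the attended label information is passed upward so the next layer can refine it. The layer sizes, module names, and the BiLSTM encoder below are assumptions made only for illustration.

```python
import torch
import torch.nn as nn


class LabelAttentionLayer(nn.Module):
    """One illustrative label-attention layer: tokens attend over a shared
    set of label embeddings; the attention weights act as a soft (marginal)
    label distribution that the layer above can refine."""

    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        # Hypothetical contextual encoder; the paper's exact encoder may differ.
        self.encoder = nn.LSTM(hidden_dim, hidden_dim // 2,
                               batch_first=True, bidirectional=True)
        self.label_emb = nn.Embedding(num_labels, hidden_dim)
        self.scale = hidden_dim ** 0.5

    def forward(self, x):
        # x: (batch, seq_len, hidden_dim) token representations
        h, _ = self.encoder(x)                    # contextualized tokens
        labels = self.label_emb.weight            # (num_labels, hidden_dim)
        scores = h @ labels.t() / self.scale      # (batch, seq_len, num_labels)
        attn = scores.softmax(dim=-1)             # soft label distribution per token
        label_view = attn @ labels                # expected label representation
        # Adding the two views keeps the width fixed; the paper may combine
        # token and label information differently (e.g., by concatenation).
        return h + label_view, attn


class LabelAttentionNetwork(nn.Module):
    """Stack of label-attention layers; the top layer's attention weights are
    read off directly as per-token tag distributions, with no CRF on top."""

    def __init__(self, hidden_dim, num_labels, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            LabelAttentionLayer(hidden_dim, num_labels) for _ in range(num_layers))

    def forward(self, x):
        attn = None
        for layer in self.layers:
            x, attn = layer(x)
        return attn  # (batch, seq_len, num_labels)


# Usage with toy dimensions.
if __name__ == "__main__":
    model = LabelAttentionNetwork(hidden_dim=128, num_labels=45)
    tokens = torch.randn(2, 10, 128)              # e.g., pre-embedded sentences
    tag_dist = model(tokens)
    print(tag_dist.shape)                         # torch.Size([2, 10, 45])
```

Decoding can then be a simple per-token argmax over the top-layer distribution rather than a Viterbi search over label sequences, which is consistent with the speedups over the CRF reported in the abstract.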
Pages: 1235-1248
Number of pages: 14
Related Papers
50 records in total
  • [1] Label Attention Network for Structured Prediction
    Cui, Leyang
    Li, Yafu
    Zhang, Yue
    IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 2022, 30 : 1235 - 1248
  • [2] A Structured Prediction Approach for Label Ranking
    Korba, Anna
    Garcia, Alexandre
    d'Alche-Buc, Florence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [3] Structured Prediction of Network Response
    Su, Hongyu
    Gionis, Aristides
    Rousu, Juho
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 442 - 450
  • [4] Differentiable Dynamic Programming for Structured Prediction and Attention
    Mensch, Arthur
    Blondel, Mathieu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [5] Pseudo-Label-Vector-Guided Parallel Attention Network for Remaining Useful Life Prediction
    Park, Ye-In
    Song, Jou Won
    Kang, Suk-Ju
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19 (04) : 5602 - 5611
  • [6] Structured Attention Network for Referring Image Segmentation
    Lin, Liang
    Yan, Pengxiang
    Xu, Xiaoqian
    Yang, Sibei
    Zeng, Kun
    Li, Guanbin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1922 - 1932
  • [7] POLAT: Protein function prediction based on soft mask graph network and residue-Label ATtention
    Liu, Yang
    Zhang, Yi
    Chen, Zihao
    Peng, Jing
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2024, 110
  • [8] A multimode structured prediction model based on dynamic attribution graph attention network for complex industrial processes
    Sun, Bei
    Lv, Mingjie
    Zhou, Can
    Li, Yonggang
    INFORMATION SCIENCES, 2023, 640
  • [9] A pseudo-label supervised graph fusion attention network for drug-target interaction prediction
    Xie, Yining
    Wang, Xiaodong
    Wang, Pengda
    Bi, Xueyan
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 259
  • [10] Hyperbolic Embedding Inference for Structured Multi-Label Prediction
    Xiong, Bo
    Cochez, Michael
    Nayyeri, Mojtaba
    Staab, Steffen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,