Dissecting Contextual Word Embeddings: Architecture and Representation

Cited by: 0
Authors
Peters, Matthew E. [1 ]
Neumann, Mark [1 ]
Zettlemoyer, Luke [2 ]
Yih, Wen-tau [1 ]
Affiliations
[1] Allen Inst Artificial Intelligence, Seattle, WA 98103 USA
[2] Univ Washington, Paul G Allen Comp Sci & Engn, Seattle, WA 98195 USA
Source
2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018) | 2018
Keywords
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g. LSTM, CNN, or self-attention) influences both end task accuracy and qualitative properties of the representations that are learned. We show there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings for four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphology-based at the word embedding layer, through local syntax in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
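The depth-wise trend the abstract describes (morphology at the embedding layer, local syntax in the lower layers, semantics such as coreference at the upper layers) can be probed by extracting per-layer representations from a pre-trained bidirectional model. Below is a minimal sketch, assuming the HuggingFace transformers library and a BERT checkpoint as a convenient stand-in for the paper's own LSTM, CNN, and self-attention biLMs; the model name and the two example sentences are illustrative choices, not the paper's setup.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased", output_hidden_states=True)
model.eval()

def layer_vectors(sentence: str, word: str) -> list:
    """Return the target word's vector at every layer (embeddings + each block)."""
    enc = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    idx = tokens.index(word)  # simplified lookup; assumes `word` is a single token
    with torch.no_grad():
        hidden = model(**enc).hidden_states  # tuple: layer 0 (embeddings) through last block
    return [h[0, idx] for h in hidden]

# Same surface word in two different senses: if upper layers encode more
# contextual semantics, the cross-context similarity should fall with depth.
a = layer_vectors("The bank raised interest rates.", "bank")
b = layer_vectors("They sat on the river bank.", "bank")
for layer, (va, vb) in enumerate(zip(a, b)):
    sim = torch.cosine_similarity(va, vb, dim=0).item()
    print(f"layer {layer:2d}: cross-context cosine similarity {sim:.3f}")

Note that the paper's own analysis relies on linear probing classifiers and unsupervised coreference and similarity tests per layer; the cosine comparison above is only a lightweight proxy for the same layer-wise behavior.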
Pages: 1499-1509
Page count: 11
Related Papers
50 items in total
  • [31] The Role of Contextual Word Embeddings in Correcting the 'de/da' Clitic Errors in Turkish
    Ozturk, Hasan
    Degirmenci, Alperen
    Gungor, Onur
    Uskudarli, Suzan
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020
  • [32] Character-to-Word Representation and Global Contextual Representation for Named Entity Recognition
    Chang, Jun
    Han, Xiaohong
    NEURAL PROCESSING LETTERS, 2023, 55 (07) : 8551 - 8567
  • [34] Text Representation Models based on the Spatial Distributional Properties of Word Embeddings
    Unnam, Narendra Babu
    Reddy, P. Krishna
    Pandey, Amit
    Manwani, Naresh
    PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024, 2024, : 603 - 604
  • [35] Beyond Accuracy: Measuring Representation Capacity of Embeddings to Preserve Structural and Contextual Information
    Ali, Sarwan
    INFORMATION MANAGEMENT AND BIG DATA, SIMBIG 2023, 2024, 2142 : 30 - 45
  • [36] A Hierarchical Book Representation of Word Embeddings for Effective Semantic Clustering and Search
    Bleiweiss, Avi
    ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2017, : 154 - 163
  • [37] Improved Word Sense Disambiguation via Prompt-based Contextual Word Representation
    He, Qipeng
    Zhang, Jian
    Huang, Xueting
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [38] Sentiment Classification with Medical Word Embeddings and Sequence Representation for Drug Reviews
    Liu, Sisi
    Lee, Ickjai
    HEALTH INFORMATION SCIENCE (HIS 2018), 2018, 11148 : 75 - 86
  • [39] Learning distributed word representation with multi-contextual mixed embedding
    Li, Jianqiang
    Li, Jing
    Fu, Xianghua
    Masud, M. A.
    Huang, Joshua Zhexue
    KNOWLEDGE-BASED SYSTEMS, 2016, 106 : 220 - 230
  • [40] Set-Word Embeddings and Semantic Indices: A New Contextual Model for Empirical Language Analysis
    de Cordoba, Pedro Fernandez
    Perez, Carlos A. Reyes
    Arnau, Claudia Sanchez
    Perez, Enrique A. Sanchez
    COMPUTERS, 2025, 14 (01)