Combining Discourse Markers and Cross-lingual Embeddings for Synonym-Antonym Classification

Cited by: 0
Authors
Roth, Michael [1]
Upadhyay, Shyam [2]
Affiliations
[1] Univ Stuttgart, Inst Nat Language Proc, Stuttgart, Germany
[2] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
It is well known that distributional semantic approaches have difficulty distinguishing between synonyms and antonyms (Grefenstette, 1992; Pado and Lapata, 2003). Recent work has shown that supervision available in English for this task (e.g., lexical resources) can be transferred to other languages via cross-lingual word embeddings. However, this kind of transfer misses monolingual distributional information available in a target language, such as contrast relations that are indicative of antonymy (e.g., hot ... while ... cold). In this work, we improve the transfer by exploiting monolingual information, expressed in the form of co-occurrences with discourse markers that convey contrast. Our approach makes use of fewer than a dozen markers, which can easily be obtained for many languages. Compared to a baseline using only cross-lingual embeddings, we show absolute improvements of 4-10% F1-score in Vietnamese and Hindi.
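The co-occurrence signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the marker list, the whitespace tokenization, and the function name `contrast_cooccurrences` are assumptions made for illustration, and the abstract does not specify how the counts are turned into classifier features. The sketch only counts how often two words appear on opposite sides of a contrast marker within a sentence.

```python
from collections import Counter

# Hypothetical English marker list for illustration only; the markers
# actually used in the paper are not given in this record.
CONTRAST_MARKERS = {"but", "while", "whereas", "although", "however"}


def contrast_cooccurrences(sentences, markers=CONTRAST_MARKERS):
    """Count unordered word pairs found on opposite sides of a contrast marker."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization
        for i, token in enumerate(tokens):
            if token not in markers:
                continue
            # Words before and after the marker, excluding other markers.
            left = {t for t in tokens[:i] if t not in markers}
            right = {t for t in tokens[i + 1:] if t not in markers}
            for a in left:
                for b in right:
                    if a != b:
                        counts[frozenset((a, b))] += 1
    return counts


corpus = [
    "the soup is hot while the salad is cold",
    "the tea was hot but the coffee was cold",
    "the room is warm and the soup is hot",
]
counts = contrast_cooccurrences(corpus)
hot_cold = counts[frozenset(("hot", "cold"))]
hot_warm = counts[frozenset(("hot", "warm"))]
print(hot_cold, hot_warm)
```

In this toy corpus the antonym pair (hot, cold) is separated by a contrast marker in two sentences, while the near-synonym pair (hot, warm) never is. Counts of this kind could plausibly be combined with cross-lingual embedding features, though how the paper does so is not stated here.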
Pages: 3899-3905
Page count: 7