Incorporating Linguistic Information to Statistical Word-Level Alignment

Cited by: 0
Authors
Cendejas, Eduardo [1]
Barcelo, Grettel [1]
Gelbukh, Alexander [1]
Sidorov, Grigori [1]
Affiliations
[1] Natl Polytech Inst, Ctr Res Comp, Mexico City, DF, Mexico
Keywords
Parallel texts; word alignment; linguistic information; dictionary; cognates; semantic domains; morphological information
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Alignment algorithms enrich parallel texts by establishing correspondences between the structures of the two languages involved. Depending on the alignment level, this enrichment links paragraphs, sentences, or words of the source-language content with their translation. There are two main approaches to word-level alignment: statistical and linguistic. Because the grammar rules of the languages differ, statistical algorithms usually yield lower precision. For this reason, such algorithms are generally developed for a specific language pair with the help of linguistic techniques. This paper presents a hybrid alignment system that combines the two traditional approaches. It provides user-friendly configuration and is adaptable to the computational environment. The system uses linguistic resources and procedures such as identification of cognates, morphological information, syntactic trees, dictionaries, and semantic domains. We show that the system outperforms existing algorithms.
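To illustrate one of the linguistic cues named in the abstract, the following is a minimal sketch of cognate identification using the longest common subsequence ratio (LCSR), a common string-similarity heuristic. The abstract does not say how the authors detect cognates, so the measure, the function names (`lcs_length`, `lcsr`, `cognate_links`), the 0.7 threshold, and the example sentence pair are all assumptions made for illustration.

```python
from typing import List, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """LCS length divided by the length of the longer word, case-insensitive."""
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def cognate_links(
    source_tokens: List[str],
    target_tokens: List[str],
    threshold: float = 0.7,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[int, int]]:
    """Link each source token to its most similar target token,
    keeping the link only if the similarity reaches the threshold."""
    links = []
    for i, src in enumerate(source_tokens):
        best_j, best_score = -1, 0.0
        for j, tgt in enumerate(target_tokens):
            score = lcsr(src, tgt)
            if score > best_score:
                best_j, best_score = j, score
        if best_j >= 0 and best_score >= threshold:
            links.append((i, best_j))
    return links


if __name__ == "__main__":
    english = ["the", "linguistic", "information"]
    spanish = ["la", "información", "lingüística"]
    for i, j in cognate_links(english, spanish):
        print(english[i], "<->", spanish[j])
```

In a hybrid system of the kind the abstract describes, links like these would presumably be only one source of evidence alongside statistical alignment scores, dictionaries, and morphological information; how the evidence is combined is not stated in this record.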
Pages: 387-394
Number of pages: 8