Sentence alignment for monolingual comparable corpora

Authors
Barzilay, R [1]
Elhadad, N [1]
Affiliation
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address the problem of sentence alignment for monolingual corpora, a phenomenon distinct from alignment in parallel corpora. Aligning large comparable corpora automatically would provide a valuable resource for learning of text-to-text rewriting rules. We incorporate context into the search for an optimal alignment in two complementary ways: learning rules for matching paragraphs using topic structure and further refining the matching through local alignment to find good sentence pairs. Evaluation shows that our alignment method outperforms state-of-the-art systems developed for the same task.
Pages: 25-32
Page count: 8