CAKES: Cross-lingual Wikipedia Knowledge Enrichment and Summarization

Authors
Fionda, Valeria [1]
Pirro, Giuseppe [1]
Affiliations
[1] Free Univ Bolzano Bozen, Bolzano, Italy
DOI
10.3233/978-1-61499-098-7-901
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Wikipedia is a huge source of multilingual knowledge curated by human contributors. Wiki articles are written independently in the various languages and may cover different perspectives on a given subject. The aim of this paper is to exploit Wikipedia's multilingual information for knowledge enrichment and summarization. Investigating the link structure of a Wiki article in a source language and comparing it with the structure of articles about the same subject written in other languages gives insight into the body of knowledge shared among languages. This investigation is also useful for identifying knowledge perspectives not covered in the source language but covered in other languages. We implemented these ideas in CAKES, which: i) exploits Wikipedia information on the fly without requiring any data preprocessing; ii) lets the user specify the set of languages to be considered; and iii) ranks subjects of interest for a given article on the basis of their popularity among languages.
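The link-structure comparison described in the abstract can be illustrated with a minimal sketch. This is not the CAKES implementation, whose details the abstract does not give. It assumes that the outgoing links of each language edition have already been fetched and mapped to shared, language-neutral subject identifiers, and it takes "popularity among languages" to mean the number of language editions that link to a subject. The function names and the sample data are hypothetical.

```python
from collections import Counter


def shared_subjects(links_by_lang):
    """Return the subjects linked from the article in every language given.

    links_by_lang maps a language code to a set of subject identifiers
    (assumed to be already aligned across languages).
    """
    return set.intersection(*links_by_lang.values())


def rank_missing_subjects(links_by_lang, source_lang):
    """Rank subjects absent from the source edition but present elsewhere.

    Popularity is assumed to be the number of other language editions
    that link to the subject; ties are broken alphabetically.
    """
    source_links = links_by_lang[source_lang]
    counts = Counter()
    for lang, links in links_by_lang.items():
        if lang == source_lang:
            continue
        # Count each subject once per language that covers it.
        counts.update(links - source_links)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link sets for one article in three language editions.
example_links = {
    "en": {"Pasta", "Rome", "Olive oil"},
    "it": {"Pasta", "Rome", "Parmigiano", "Espresso"},
    "de": {"Pasta", "Rome", "Parmigiano"},
}

print(shared_subjects(example_links))
print(rank_missing_subjects(example_links, "en"))
```

With the sample data, the shared subjects are "Pasta" and "Rome", and "Parmigiano" ranks above "Espresso" because two other editions link to it rather than one. Restricting the dictionary to a subset of language codes corresponds to choosing which languages to consider.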
Pages: 901-902
Number of pages: 2