A Comprehensive Survey and Prospect of Cross-Lingual Summarization Method Research

Authors
Wang, Jing-Dong [1]
Chang, Duo [1]
Meng, Fan-Qi [1]
Qu, Guangqiang [2]
Affiliations
[1] School of Computer Science, Northeast Electric Power University, No. 169 Changchun Road, Jilin Province, Jilin City, 132012, China
[2] Academic Affairs Office, Northeast Electric Power University, No. 169 Changchun Road, Jilin Province, Jilin City, 132012, China
Source
Journal of Network Intelligence | 2024 / Vol. 9 / No. 01
Keywords
Distillation - Large datasets - Learning systems - Natural language processing systems - Quality control - Translation (languages) - Zero-shot learning
DOI
Not available
Abstract
Cross-lingual summarization has evolved from pipeline-based methods to today's end-to-end approaches. Although end-to-end approaches largely avoid the problem of error propagation, several problems remain: the nature of cross-lingual summarization is still unclear, the ability to unify translation and summarization is insufficient, large-scale, high-quality, multi-type datasets are scarce, low-resource cross-lingual summarization is under-explored, and multi-angle evaluation metrics are lacking. Following the development of cross-lingual summarization, we therefore first briefly introduce the pipeline-based translate-then-summarize and summarize-then-translate methods. We then focus on zero-shot learning, multi-task learning, knowledge distillation, knowledge enhancement, pre-training frameworks, and compression-ratio-based cross-lingual summarization methods, sorting out the research progress of end-to-end cross-lingual summarization together with the motivation and content of each method, and conducting an in-depth comparative analysis. Because most of the world's languages are low-resource, we give particular attention to the current status of low-resource cross-lingual summarization research. We also introduce and analyze the datasets and evaluation metrics for cross-lingual summarization. Finally, we discuss possible directions for future development and present our own opinions. We hope this comprehensive and in-depth survey will help researchers interested in this field, especially in low-resource settings, and promote the further development of cross-lingual summarization. © 2024, Taiwan Ubiquitous Information CO LTD. All rights reserved.
Pages: 384 - 412
Related papers
50 in total
  • [21] Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation
    Zhang, Ran
    Ouni, Jihed
    Eger, Steffen
    COMPUTATIONAL LINGUISTICS, 2024, 50 (03) : 1001 - 1047
  • [22] Cross-Lingual Sentiment Analysis: A Survey
    Xu Y.
    Cao H.
    Wang W.
    Du W.
    Xu C.
    Data Analysis and Knowledge Discovery, 2023, 7 (01) : 1 - 21
  • [23] Multi-Task Learning for Cross-Lingual Abstractive Summarization
    Takase, Sho
    Okazaki, Naoaki
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3008 - 3016
  • [24] Multi-Task Learning for Cross-Lingual Abstractive Summarization
    Takase, Sho
    Okazaki, Naoaki
    2022 Language Resources and Evaluation Conference, LREC 2022, 2022, : 3008 - 3016
  • [25] PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India
    Urlana, Ashok
    Chen, Pinzhen
    Zhao, Zheng
    Cohen, Shay B.
    Shrivastava, Manish
    Haddow, Barry
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 11606 - 11628
  • [26] Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline
    Karande, Pranav
    Sarkar, Balaram
    Maurya, Chandresh Kumar
    SPEECH AND COMPUTER, SPECOM 2024, PT I, 2025, 15299 : 119 - 133
  • [27] Unifying Cross-lingual Summarization and Machine Translation with Compression Rate
    Bai, Yu
    Huang, Heyan
    Fan, Kai
    Gao, Yang
    Zhu, Yiming
    Zhan, Jiaao
    Chi, Zewen
    Chen, Boxing
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1087 - 1097
  • [28] WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization
    Ladhak, Faisal
    Durmus, Esin
    Cardie, Claire
    McKeown, Kathleen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4034 - 4048
  • [29] CATAMARAN: A Cross-lingual Long Text Abstractive Summarization Dataset
    Chen, Zheng
    Lin, Hongyu
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6932 - 6937
  • [30] A survey of cross-lingual word embedding models
    Ruder, Sebastian
    Vulić, Ivan
    Søgaard, Anders
    Journal of Artificial Intelligence Research, 2019, 65 : 569 - 631