A Comprehensive Survey and Prospect of Cross-Lingual Summarization Method Research

Cited by: 0

Authors:

Wang, Jing-Dong ^{[1]}

Chang, Duo ^{[1]}

Meng, Fan-Qi ^{[1]}

Qu, Guangqiang ^{[2]}

Affiliations:

[1] School of Computer Science, Northeast Electric Power University, No. 169 Changchun Road, Jilin Province, Jilin City, 132012, China

[2] Academic Affairs Office, Northeast Electric Power University, No. 169 Changchun Road, Jilin Province, Jilin City, 132012, China

Source:

Journal of Network Intelligence | 2024 / Vol. 9 / No. 01

Keywords:

Distillation - Large datasets - Learning systems - Natural language processing systems - Quality control - Translation (languages) - Zero-shot learning

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Cross-lingual summarization has evolved from pipeline-based methods to today's end-to-end approaches. Although end-to-end approaches largely avoid the error propagation problem, several issues remain: the nature of cross-lingual summarization is still unclear, the ability to unify translation and summarization is insufficient, large-scale, high-quality, multi-type datasets are scarce, low-resource cross-lingual summarization is under-explored, and multi-angle evaluation metrics are lacking. Following the development of cross-lingual summarization, we first briefly introduce the pipeline-based translate-then-summarize and summarize-then-translate methods. We then focus on zero-shot learning, multi-task learning, knowledge distillation, knowledge enhancement, pre-training frameworks, and compression-ratio-based cross-lingual summarization methods, reviewing the research progress of end-to-end cross-lingual summarization and the motivation and content of each method, with an in-depth comparative analysis. Because most of the world's languages are low-resource, we give particular attention to the current state of low-resource cross-lingual summarization research. We also introduce and analyze the datasets and evaluation metrics for cross-lingual summarization. Finally, we discuss possible directions for future development and present our own views. We hope this comprehensive survey will help researchers interested in the field, especially in low-resource settings, and promote the further development of cross-lingual summarization. © 2024, Taiwan Ubiquitous Information CO LTD. All rights reserved.

Pages: 384 - 412

50 entries in total

[21] Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation
Zhang, Ran
Ouni, Jihed
Eger, Steffen
COMPUTATIONAL LINGUISTICS, 2024, 50 (03) : 1001 - 1047
[22] Cross-Lingual Sentiment Analysis: A Survey
Xu Y.
Cao H.
Wang W.
Du W.
Xu C.
Data Analysis and Knowledge Discovery, 2023, 7 (01) : 1 - 21
[23] Multi-Task Learning for Cross-Lingual Abstractive Summarization
Takase, Sho
Okazaki, Naoaki
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3008 - 3016
[24] Multi-Task Learning for Cross-Lingual Abstractive Summarization
Takase, Sho
Okazaki, Naoaki
2022 Language Resources and Evaluation Conference, LREC 2022, 2022, : 3008 - 3016
[25] PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India
Urlana, Ashok
Chen, Pinzhen
Zhao, Zheng
Cohen, Shay B.
Shrivastava, Manish
Haddow, Barry
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 11606 - 11628
[26] Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline
Karande, Pranav
Sarkar, Balaram
Maurya, Chandresh Kumar
SPEECH AND COMPUTER, SPECOM 2024, PT I, 2025, 15299 : 119 - 133
[27] Unifying Cross-lingual Summarization and Machine Translation with Compression Rate
Bai, Yu
Huang, Heyan
Fan, Kai
Gao, Yang
Zhu, Yiming
Zhan, Jiaao
Chi, Zewen
Chen, Boxing
PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1087 - 1097
[28] WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization
Ladhak, Faisal
Durmus, Esin
Cardie, Claire
McKeown, Kathleen
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4034 - 4048
[29] CATAMARAN: A Cross-lingual Long Text Abstractive Summarization Dataset
Chen, Zheng
Lin, Hongyu
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6932 - 6937
[30] A survey of cross-lingual word embedding models
Ruder, Sebastian
Vulić, Ivan
Søgaard, Anders
Journal of Artificial Intelligence Research, 2019, 65 : 569 - 631
