Katibeh: A Persian news summarizer using the novel semi-supervised approach

Cited by: 1
Authors
Farzi, Saeed [1]
Kianian, Sahar [2]
Affiliations
[1] KN Toosi Univ Technol, Fac Comp Engn, Tehran, Iran
[2] Shahid Rajaee Teacher Training Univ, Fac Comp Engn, Tehran, Iran
DOI
10.1093/llc/fqy034
Chinese Library Classification number
C [General Social Sciences];
Discipline classification code
03 ; 0303 ;
Abstract
Nowadays, text summarization is one of the most important active research fields in information retrieval. Most supervised extractive summarization systems use learning-to-rank methods to score sentences according to their importance. They require a comprehensive, high-quality summarization corpus that is labeled manually by human experts. Unfortunately, such a corpus is not available for most low-resource languages, including Persian. This study first introduces Bistoon, a comprehensive human-labeled summarization corpus collected through crowdsourcing. It then presents a Persian summarizer based on a novel semi-supervised summarization approach, a combination of co-training and self-training, to overcome the lack of sufficient data. In an iterative process, the proposed system is trained on the Bistoon corpus and applied to unlabeled texts to generate summaries, and the most confident summaries are added to Bistoon for further iterations. Over the iterations, the training corpus grows and the quality of the summarizer improves at the same time. The proposed system was compared with other well-known Persian summarizers on the Pasokh and Bistoon standard test data sets. The evaluation results show the superiority of the proposed methods in terms of precision, F-measure, ROUGE metrics, and human judgments.
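The iterative process described in the abstract can be illustrated with a minimal self-training sketch. This is not the authors' implementation: it covers only the self-training half of the approach (co-training is omitted), it swaps the learning-to-rank model for a toy nearest-centroid classifier, and the two-dimensional sentence features, the `threshold` value, and all function names are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]
Example = Tuple[Vector, int]  # (sentence features, 1 = summary sentence, 0 = not)


def centroid(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def train(labeled: List[Example]) -> Tuple[Vector, Vector]:
    """Toy stand-in model: one centroid per class (assumes both classes occur)."""
    pos = [v for v, y in labeled if y == 1]
    neg = [v for v, y in labeled if y == 0]
    return centroid(pos), centroid(neg)


def predict(model: Tuple[Vector, Vector], v: Vector) -> Tuple[int, float]:
    """Return (label, confidence); confidence is the normalized distance margin."""
    pos_c, neg_c = model
    d_pos, d_neg = distance(v, pos_c), distance(v, neg_c)
    label = 1 if d_pos < d_neg else 0
    total = d_pos + d_neg
    confidence = abs(d_pos - d_neg) / total if total else 0.0
    return label, confidence


def self_train(labeled: List[Example], unlabeled: List[Vector],
               threshold: float = 0.5, max_iters: int = 5):
    """Grow the training set with confident pseudo-labels, retraining each round."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_iters):
        model = train(labeled)
        confident, remaining = [], []
        for v in pool:
            label, conf = predict(model, v)
            if conf >= threshold:
                confident.append((v, label))
            else:
                remaining.append(v)
        if not confident:  # nothing confident enough: stop early
            break
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled, pool
```

Calling `self_train` with a small labeled seed set and a pool of unlabeled feature vectors returns the final model, the enlarged training set, and the items that never reached the confidence threshold. The threshold controls the trade-off between growing the corpus quickly and admitting noisy pseudo-labels.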
Pages: 277-289
Number of pages: 13