Automatic Summarization and Keyword Extraction from Web Page or Text File

Citations: 0
Author
You, Xiangdong [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, Key Lab Universal Wireless Commun, Minist Educ, Beijing 100876, Peoples R China
Keywords
automatic summarization; keyword extraction; readability; textrank
DOI
10.1109/ccet48361.2019.8989315
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we study automatic summarization and keyword extraction techniques for web pages and text files. We first use the Readability algorithm to extract the main text of a web page. We then study the PageRank and TextRank algorithms and use TextRank to extract keywords, key sentences, and an abstract. We also develop a web application that processes web pages and text files: given a URL, a text file, or a text paragraph as input, it extracts the main content, abstract, keywords, and key sentences.
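The record gives no implementation details, so the following is only a minimal sketch of how TextRank-style keyword extraction is commonly done, not the paper's code. It builds an undirected word co-occurrence graph and scores words with a PageRank-style iteration. The function name `textrank_keywords`, the small `STOPWORDS` set, the window size, and the damping factor of 0.85 are all illustrative assumptions. Filtering stopwords before applying the window is a simplification.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "and", "for", "that", "this", "then", "also", "use",
             "can", "from", "with", "are", "was", "into"}


def textrank_keywords(text, window=4, damping=0.85, iterations=50, top_k=5):
    """Rank words with a PageRank-style iteration over a co-occurrence graph."""
    # Tokenize to lowercase alphabetic words, dropping stopwords and very short tokens.
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOPWORDS and len(w) > 2]

    # Link each word to the distinct words that follow it within the window.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # Every node starts at 1.0; each round a node shares its score
    # equally among its neighbors, damped as in PageRank.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    sample = ("We use the Readability algorithm to extract the text of the web page, "
              "and then use the TextRank algorithm to extract keywords and key sentences.")
    print(textrank_keywords(sample))
```

The same iteration can rank sentences instead of words if the graph nodes are sentences and the edges are weighted by sentence similarity; that variant is omitted here.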
Pages: 154-158
Page count: 5
Related papers
50 in total
  • [1] Chinese Automatic Text Summarization Based on Keyword Extraction
    Jiang Xiao-yu
    FIRST INTERNATIONAL WORKSHOP ON DATABASE TECHNOLOGY AND APPLICATIONS, PROCEEDINGS, 2009, : 225 - 228
  • [2] Automatic Keyword Extraction for Text Summarization in e-Newspapers
    Thomas, Justine Raju
    Bharti, Santosh Kumar
    Babu, Korra Sathya
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATICS AND ANALYTICS (ICIA' 16), 2016,
  • [3] Text Summarization with Automatic Keyword Extraction in Telugu e-Newspapers
    Naidu, Reddy
    Bharti, Santosh Kumar
    Babu, Korra Sathya
    Mohapatra, Ramesh Kumar
    SMART COMPUTING AND INFORMATICS, 2018, 77 : 555 - 564
  • [4] Language-independent extractive automatic text summarization based on automatic keyword extraction
    Hernandez-Castaneda, Angel
    Arnulfo Garcia-Hernandez, Rene
    Ledeneva, Yulia
    Eduardo Millan-Hernandez, Christian
    COMPUTER SPEECH AND LANGUAGE, 2022, 71
  • [5] Automatic text summarization based on keyword derivation
    Ando, K
    Yamasaki, T
    Shishibori, M
    Aoe, JI
    2001 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: E-SYSTEMS AND E-MAN FOR CYBERNETICS IN CYBERSPACE, 2002, : 464 - 469
  • [6] Automatic Keyword Extraction From Dialogue Text
    Sali, Yusuf
    Erden, Mustafa
    2022 30TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2022,
  • [7] HTML text segmentation for Web page summarization by a key sentence extraction method
    Sunayama, Wataru
    Iyama, Akihiro
    Yachida, Masahiko
    Systems and Computers in Japan, 2006, 37 (07): : 26 - 36
  • [8] Automatic Summarization of Web Page Based on Statistics and Structure
    Zheng, Shuangyi
    Yu, Junyang
    KNOWLEDGE DISCOVERY AND DATA MINING, 2012, 135 : 643 - +
  • [9] Automatic text summarization for web pages on Internet
    State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China
    Jisuanji Gongcheng, 2006, 3 (88-90):
  • [10] AUTOMATIC TEXT SUMMARIZATION USING PAGE RANK AND GENETIC ALGORITHM
    Gupta, Shashank
    Jagrawal, Anushree
    Mathur, Neha
    JOURNAL OF RAJASTHAN ACADEMY OF PHYSICAL SCIENCES, 2014, 13 (02): : 171 - 179