Keyphrase extraction from Chinese news web pages based on semantic relations

Cited by: 0
Authors
Xie, Fei [1, 4]
Wu, Xindong [1, 2]
Hu, Xue-Gang [1]
Wang, Fei-Yue [3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
[2] Univ Vermont, Dept Comp Sci, Burlington, VT 50405 USA
[3] Chinese Acad Sci, Inst Automat, Beijing 100864, Peoples R China
[4] Hefei Teachers Coll, Dept Comp Sci & Technol, Hefei 230061, Peoples R China
Keywords
keyphrase extraction; semantic relation; word similarity; word co-occurrence; lexical chain
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrases save readers time when browsing news web pages. This paper presents a new method for extracting keyphrases from Chinese news web pages based on semantic relations. Semantic relations between phrases are analyzed, and a lexical chain is used to construct a semantic relation graph. Keyphrases are extracted and a semantic link graph is built on the lexical chains. News web pages with core hints are selected from www.163.com to test the method. The experimental results show that the proposed method substantially outperforms the method based on term frequency, especially when the number of keyphrases extracted is 3: precision improves by 26.97 percent and recall improves by 20.93 percent.
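The abstract compares the proposed method against a term-frequency baseline using precision and recall at a fixed number of extracted keyphrases. The following is a minimal sketch of that kind of evaluation, not the authors' implementation: the function names, the English stand-in phrases (used in place of segmented Chinese phrases), and the gold set standing in for a page's core hints are all assumptions made for illustration.

```python
from collections import Counter


def top_k_by_frequency(phrases, k):
    """Term-frequency baseline: return the k most frequent candidate phrases."""
    return [phrase for phrase, _ in Counter(phrases).most_common(k)]


def precision_recall(extracted, gold):
    """Precision and recall of extracted keyphrases against a gold set."""
    extracted_set, gold_set = set(extracted), set(gold)
    hits = len(extracted_set & gold_set)
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical candidate phrases from one already-segmented news page.
candidates = ["economy", "growth", "economy", "policy",
              "growth", "economy", "market"]
# Hypothetical gold keyphrases (standing in for the page's core hints).
gold = ["economy", "policy", "market", "reform"]

top3 = top_k_by_frequency(candidates, 3)
precision, recall = precision_recall(top3, gold)
print(top3, round(precision, 4), round(recall, 4))
```

Scoring any other extractor, such as one based on lexical chains, with the same `precision_recall` function on the same pages would give the kind of side-by-side comparison the abstract reports.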
Pages: 490 / +
Page count: 2
Related papers
50 in total
  • [1] Keyphrase extraction from Chinese news web pages based on semantic relations
    Xie, Fei
    Wu, Xindong
    Hu, Xue-Gang
    Wang, Fei-Yue
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2008, 5075 : 490 - 495
  • [2] Unsupervised Keyphrase Extraction for Web Pages
    Haarman, Tim
    Zijlema, Bastiaan
    Wiering, Marco
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2019, 3 (03)
  • [3] Turkish Keyphrase Extraction from Web Pages with BERT
    Ayan, Emre Tolga
    Arslan, Rabia
    Zengin, Muhammed Said
    Duru, Haci Ali
    Salman, Sedat
    Bardak, Batuhan
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [4] Keyword extraction based on lexical chains for Chinese news web pages
    Hu, Xue-Gang
    Li, Xing-Hua
    Xie, Fei
    Wu, Xin-Dong
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2010, 23 (01): 45 - 51
  • [5] Automatic keyphrase extraction from Chinese news documents
    Wang, HF
    Li, SJ
    Yu, SW
    FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 2, PROCEEDINGS, 2005, 3614 : 648 - 657
  • [6] Structured and semantic data extraction from Web pages
    Gan, Y
    Zhang, SZ
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 2930 - 2935
  • [7] Content Extraction from Web Pages Based on Chinese Punctuation Number
    Song, Mingqiu
    Wu, Xintao
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 5573 - 5575
  • [8] Automatic Keyphrase Extraction from Persian Scientific Documents Using Semantic Relations
    Farahani, Bahare Davoodabadi
    Fatemi, Seied Omid
    Ghorbani, Mohsen
    2019 27TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE 2019), 2019, : 1972 - 1978
  • [9] Automatic Extraction of Textual Elements from News Web Pages
    Ibrahim, Hossam
    Darwish, Kareem
    Abdel-sabor, Abdel-Rahim
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 1600 - 1603
  • [10] Extraction of web news from web pages using a ternary tree approach
    Laishram, Debina
    Sebastian, Merin
    2015 SECOND INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING AND COMMUNICATION ENGINEERING ICACCE 2015, 2015, : 628 - 633