Web directory construction using lexical chains

Cited by: 0
Authors
Stamou, S [1]
Krikos, V
Kokosis, P
Ntoulas, A
Christodoulakis, D
Affiliations
[1] Univ Patras, Dept Comp Engn, Comp Technol Inst, GR-26500 Patras, Greece
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Web Directories provide a way of locating relevant information on the Web. Typically, Web Directories rely on humans investing significant time and effort in finding important pages on the Web and categorizing them in the Directory. In this paper we present a way to automate the creation of a Web Directory. At a high level, our method takes as input a subject hierarchy and a collection of pages. We first leverage a variety of lexical resources from the Natural Language Processing community to enrich our hierarchy. After that, we process the pages and identify sequences of important terms, which are referred to as lexical chains. Finally, we use the lexical chains to decide where in the enriched subject hierarchy each page should be assigned. Our experimental results with real Web data show that our method is quite promising for assisting humans during page categorization.
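The pipeline outlined in the abstract (enriched hierarchy, lexical chains, page assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the `RELATED` lexicon, the `HIERARCHY` topics, the greedy chaining rule, and the overlap-times-length score are all hypothetical stand-ins chosen for illustration, and a real system would draw term relations from a lexical resource instead of a hand-written table.

```python
from collections import defaultdict

# Hypothetical toy lexicon (assumption): term -> semantically related terms.
# It stands in for the lexical resources mentioned in the abstract.
RELATED = {
    "guitar": {"music", "instrument", "band"},
    "band": {"music", "guitar", "concert"},
    "concert": {"music", "band"},
    "goal": {"football", "match"},
    "match": {"football", "goal"},
}

# Hypothetical enriched subject hierarchy (assumption): topic -> associated terms.
HIERARCHY = {
    "Arts/Music": {"music", "instrument", "band", "concert", "guitar"},
    "Sports/Football": {"football", "goal", "match", "league"},
}


def related(a, b):
    """Two terms are related if identical or linked in the toy lexicon."""
    return a == b or b in RELATED.get(a, set()) or a in RELATED.get(b, set())


def build_chains(tokens):
    """Greedily group lexicon terms into chains of mutually related words."""
    chains = []
    for tok in tokens:
        if tok not in RELATED:  # skip terms the lexicon does not know
            continue
        for chain in chains:
            if any(related(tok, member) for member in chain):
                chain.append(tok)
                break
        else:
            chains.append([tok])  # no related chain found: start a new one
    return chains


def assign_topic(tokens):
    """Pick the topic whose terms overlap most with the page's chains.

    Each chain contributes (overlap with topic terms) * (chain length),
    so longer chains weigh more. Returns None if nothing matches.
    """
    scores = defaultdict(int)
    for chain in build_chains(tokens):
        for topic, terms in HIERARCHY.items():
            overlap = sum(1 for t in chain if t in terms)
            scores[topic] += overlap * len(chain)
    if not scores or max(scores.values()) == 0:
        return None
    return max(scores, key=scores.get)


page = "the band played guitar at the concert before the match".split()
print(build_chains(page))
print(assign_topic(page))
```

Weighting each chain by its length is one simple way to let a page's dominant theme outvote incidental terms; other scoring schemes would fit the same structure.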
Pages: 138-149
Page count: 12