Mining Text Snippets for Images on the Web

Cited by: 2
Authors
Kannan, Anitha [1]
Baker, Simon [1]
Ramnath, Krishnan [1]
Fiss, Juliet [2]
Lin, Dahua [3]
Vanderwende, Lucy [1]
Affiliations
[1] Microsoft, Redmond, WA 98052 USA
[2] Univ Washington, Seattle, WA 98195 USA
[3] TTI Chicago, Chicago, IL USA
Keywords
Text mining for images; Text snippets; Interestingness; Relevance; Diversity; Browsing; Semantic image browsing; Web image augmentation
DOI
10.1145/2623330.2623346
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Images are often used to convey many different concepts or illustrate many different stories. We propose an algorithm to mine multiple diverse, relevant, and interesting text snippets for images on the web. Our algorithm scales to all images on the web. For each image, all webpages that contain it are considered. The top-K text snippet selection problem is posed as combinatorial subset selection with the goal of choosing an optimal set of snippets that maximizes a combination of relevancy, interestingness, and diversity. The relevancy and interestingness are scored by machine learned models. Our algorithm is run at scale on the entire image index of a major search engine resulting in the construction of a database of images with their corresponding text snippets. We validate the quality of the database through a large-scale comparative study. We showcase the utility of the database through two web-scale applications: (a) augmentation of images on the web as webpages are browsed and (b) an image browsing experience (similar in spirit to web browsing) that is enabled by interconnecting semantically related images (which may not be visually related) through shared concepts in their corresponding text snippets.
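
The abstract frames top-K snippet selection as choosing a subset that balances relevancy, interestingness, and diversity, but gives no objective or solver. The sketch below is one possible reading, not the authors' method: it assumes per-snippet relevance and interestingness scores are already available (in the paper they come from learned models), uses a greedy heuristic in place of whatever optimizer the paper applies, and measures diversity with token-level Jaccard overlap. The names `select_top_k`, `jaccard_similarity`, and `diversity_weight` are hypothetical.

```python
from typing import List, Sequence, Set


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_top_k(
    snippets: Sequence[str],
    relevance: Sequence[float],
    interestingness: Sequence[float],
    k: int,
    diversity_weight: float = 1.0,
) -> List[int]:
    """Greedily pick up to k snippet indices (illustrative heuristic only).

    Assumed objective: a candidate's gain is relevance + interestingness
    minus a redundancy penalty, where redundancy is its maximum token
    overlap with any snippet already chosen.
    """
    tokens = [set(s.lower().split()) for s in snippets]
    chosen: List[int] = []
    remaining = list(range(len(snippets)))

    while remaining and len(chosen) < k:

        def gain(i: int) -> float:
            # Penalize candidates that repeat what is already selected.
            redundancy = max(
                (jaccard_similarity(tokens[i], tokens[j]) for j in chosen),
                default=0.0,
            )
            return relevance[i] + interestingness[i] - diversity_weight * redundancy

        best = max(remaining, key=gain)
        chosen.append(best)
        remaining.remove(best)

    return chosen


# Made-up example data: two near-duplicate snippets and one distinct one.
example_snippets = [
    "the eiffel tower at night",
    "the eiffel tower at night paris",
    "gustave eiffel designed the tower",
]
example_relevance = [0.9, 0.85, 0.6]
example_interestingness = [0.5, 0.5, 0.6]
picked = select_top_k(example_snippets, example_relevance, example_interestingness, k=2)
```

With the diversity penalty enabled, the near-duplicate second snippet loses to the third one even though its raw score is higher; setting `diversity_weight=0.0` reduces the selection to a plain ranking by relevance plus interestingness.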
Pages: 1534-1543
Number of pages: 10