Mining Text Snippets for Images on the Web

Cited by: 2
Authors
Kannan, Anitha [1]
Baker, Simon [1]
Ramnath, Krishnan [1]
Fiss, Juliet [2]
Lin, Dahua [3]
Vanderwende, Lucy [1]
Affiliations
[1] Microsoft, Redmond, WA 98052 USA
[2] Univ Washington, Seattle, WA 98195 USA
[3] TTI Chicago, Chicago, IL USA
Keywords
Text mining for images; Text snippets; Interestingness; Relevance; Diversity; Browsing; Semantic image browsing; Web image augmentation
DOI
10.1145/2623330.2623346
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Images are often used to convey many different concepts or illustrate many different stories. We propose an algorithm to mine multiple diverse, relevant, and interesting text snippets for images on the web. Our algorithm scales to all images on the web. For each image, all webpages that contain it are considered. The top-K text snippet selection problem is posed as combinatorial subset selection with the goal of choosing an optimal set of snippets that maximizes a combination of relevancy, interestingness, and diversity. The relevancy and interestingness are scored by machine learned models. Our algorithm is run at scale on the entire image index of a major search engine resulting in the construction of a database of images with their corresponding text snippets. We validate the quality of the database through a large-scale comparative study. We showcase the utility of the database through two web-scale applications: (a) augmentation of images on the web as webpages are browsed and (b) an image browsing experience (similar in spirit to web browsing) that is enabled by interconnecting semantically related images (which may not be visually related) through shared concepts in their corresponding text snippets.
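The abstract frames top-K snippet selection as choosing a subset that trades off relevancy, interestingness, and diversity. The following is a minimal illustrative sketch of that kind of objective, not the authors' algorithm: it assumes a greedy selection rule, precomputed per-snippet scores (standing in for the machine-learned models), a Jaccard token distance as the diversity measure, and hypothetical weights `alpha`, `beta`, and `gamma`. None of these choices are stated in the record.

```python
from typing import List, Sequence, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def select_top_k(
    snippets: Sequence[str],
    relevance: Sequence[float],        # assumed output of a relevance model
    interestingness: Sequence[float],  # assumed output of an interestingness model
    k: int,
    alpha: float = 1.0,  # hypothetical weight on relevance
    beta: float = 1.0,   # hypothetical weight on interestingness
    gamma: float = 1.0,  # hypothetical weight on diversity
) -> List[int]:
    """Greedily pick up to k snippet indices, one per round."""
    tokens = [set(s.lower().split()) for s in snippets]
    selected: List[int] = []
    remaining = set(range(len(snippets)))

    def gain(i: int) -> float:
        # Quality term: weighted sum of the two per-snippet scores.
        quality = alpha * relevance[i] + beta * interestingness[i]
        if not selected:
            return quality
        # Diversity term: distance to the closest already-selected snippet.
        novelty = min(jaccard_distance(tokens[i], tokens[j]) for j in selected)
        return quality + gamma * novelty

    while remaining and len(selected) < k:
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


snippets = [
    "the eiffel tower in paris",
    "the eiffel tower in paris",
    "built in 1889 for the world fair",
]
relevance = [0.9, 0.85, 0.6]
interestingness = [0.5, 0.5, 0.7]
print(select_top_k(snippets, relevance, interestingness, k=2))
```

With the diversity weight switched off (`gamma=0.0`), the same call would return the two near-duplicate snippets, which is the failure mode a diversity term is meant to avoid.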
Pages: 1534-1543
Page count: 10