Large Scale Retrieval and Generation of Image Descriptions

Cited by: 48
Authors
Ordonez, Vicente [1]
Han, Xufeng [1]
Kuznetsova, Polina [2]
Kulkarni, Girish [2]
Mitchell, Margaret [3]
Yamaguchi, Kota [4]
Stratos, Karl [5]
Goyal, Amit [6]
Dodge, Jesse [7]
Mensch, Alyssa [8]
Daume, Hal, III [9]
Berg, Alexander C. [1]
Choi, Yejin [10]
Berg, Tamara L. [1]
Affiliations
[1] Univ N Carolina, Chapel Hill, NC 27599 USA
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
[3] Microsoft Res, Redmond, WA USA
[4] Tohoku Univ, Sendai, Miyagi, Japan
[5] Columbia Univ, New York, NY USA
[6] Yahoo Labs, Sunnyvale, CA USA
[7] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[8] Univ Penn, Philadelphia, PA 19104 USA
[9] Univ Maryland, College Pk, MD 20742 USA
[10] Univ Washington, Seattle, WA 98195 USA
Keywords
Retrieval; Image description; Data driven; Big data; Natural language processing; Scene
DOI
10.1007/s11263-015-0840-y
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
What is the story of an image? What is the relationship between pictures, language, and information we can extract using state of the art computational recognition systems? In an attempt to address both of these questions, we explore methods for retrieving and generating natural language descriptions for images. Ideally, we would like our generated textual descriptions (captions) to both sound like a person wrote them, and also remain true to the image content. To do this we develop data-driven approaches for image description generation, using retrieval-based techniques to gather either: (a) whole captions associated with a visually similar image, or (b) relevant bits of text (phrases) from a large collection of image + description pairs. In the case of (b), we develop optimization algorithms to merge the retrieved phrases into valid natural language sentences. The end result is two simple, but effective, methods for harnessing the power of big data to produce image captions that are altogether more general, relevant, and human-like than previous attempts.
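Approach (a) in the abstract, transferring a whole caption from a visually similar image, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional feature vectors, the cosine similarity measure, the toy captions, and the names `cosine_similarity` and `retrieve_caption` are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_caption(query_features, collection):
    """Return the caption paired with the most similar image features.

    `collection` is a list of (feature_vector, caption) pairs, standing in
    for a large set of image + description pairs.
    """
    if not collection:
        raise ValueError("collection must not be empty")
    _, best_caption = max(
        collection,
        key=lambda pair: cosine_similarity(query_features, pair[0]),
    )
    return best_caption


# Hypothetical toy collection: hand-made features, invented captions.
collection = [
    ([0.9, 0.1, 0.0], "A dog runs across a grassy field."),
    ([0.1, 0.8, 0.1], "A sailboat on a calm lake at sunset."),
    ([0.0, 0.2, 0.9], "A city street crowded with people at night."),
]

query = [0.2, 0.9, 0.0]
print(retrieve_caption(query, collection))
```

With these toy vectors the query lies closest to the second entry, so its caption is returned unchanged. Approach (b), which merges retrieved phrases into new sentences through optimization, is not covered by this sketch.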
Pages: 46-59
Page count: 14
Related papers
50 in total
  • [1] Large Scale Retrieval and Generation of Image Descriptions
    Vicente Ordonez
    Xufeng Han
    Polina Kuznetsova
    Girish Kulkarni
    Margaret Mitchell
    Kota Yamaguchi
    Karl Stratos
    Amit Goyal
    Jesse Dodge
    Alyssa Mensch
    Hal Daumé
    Alexander C. Berg
    Yejin Choi
    Tamara L. Berg
    International Journal of Computer Vision, 2016, 119 : 46 - 59
  • [2] Simhash for large scale image retrieval
    Guo, Qin-Zhen
    Zeng, Zhi
    Zhang, Shuwu
    Feng, Xiao
    Guan, Hu
    MATERIAL SCIENCE, CIVIL ENGINEERING AND ARCHITECTURE SCIENCE, MECHANICAL ENGINEERING AND MANUFACTURING TECHNOLOGY II, 2014, 651-653 : 2197 - 2200
  • [3] Large-Scale Image Retrieval with Elasticsearch
    Amato, Giuseppe
    Bolettieri, Paolo
    Carrara, Fabio
    Falchi, Fabrizio
    Gennaro, Claudio
    ACM/SIGIR PROCEEDINGS 2018, 2018, : 925 - 928
  • [4] Three Components for Large Scale Image Retrieval
    Wang, Hai
    Zhang, Shuwu
    Liang, Wei
    PROCEEDINGS OF 2012 INTERNATIONAL CONFERENCE ON IMAGE ANALYSIS AND SIGNAL PROCESSING, 2012, : 54 - 58
  • [5] LARGE SCALE IMAGE RETRIEVAL WITH VISUAL GROUPS
    Dai, Lican
    Sun, Xiaoyan
    Wu, Feng
    Yu, Nenghai
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 2582 - 2586
  • [6] Tensor index for large scale image retrieval
    Liang Zheng
    Shengjin Wang
    Peizhen Guo
    Hanyue Liang
    Qi Tian
    Multimedia Systems, 2015, 21 : 569 - 579
  • [7] Tensor index for large scale image retrieval
    Zheng, Liang
    Wang, Shengjin
    Guo, Peizhen
    Liang, Hanyue
    Tian, Qi
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 569 - 579
  • [8] Promising Large Scale Image Retrieval by using Intelligent Semantic Binary Code Generation Technique
    Khodaskar, Anuja
    Ladhake, Siddarth
    INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION AND CONVERGENCE (ICCC 2015), 2015, 48 : 282 - 287
  • [9] Probabilistic reverse annotation for large scale image retrieval
    Sankar, Pramod K.
    Jawahar, C. V.
    2007 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-8, 2007, : 1523 - +
  • [10] Hierarchical Semantic Indexing for Large Scale Image Retrieval
    Jia Deng
    Berg, Alexander C.
    Li Fei-Fei
    2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011, : 785 - 792