Large Scale Retrieval and Generation of Image Descriptions

Cited by: 48
Authors
Ordonez, Vicente [1]
Han, Xufeng [1]
Kuznetsova, Polina [2]
Kulkarni, Girish [2]
Mitchell, Margaret [3]
Yamaguchi, Kota [4]
Stratos, Karl [5]
Goyal, Amit [6]
Dodge, Jesse [7]
Mensch, Alyssa [8]
Daume, Hal, III [9]
Berg, Alexander C. [1]
Choi, Yejin [10]
Berg, Tamara L. [1]
Affiliations
[1] Univ N Carolina, Chapel Hill, NC 27599 USA
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
[3] Microsoft Res, Redmond, WA USA
[4] Tohoku Univ, Sendai, Miyagi, Japan
[5] Columbia Univ, New York, NY USA
[6] Yahoo Labs, Sunnyvale, CA USA
[7] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[8] Univ Penn, Philadelphia, PA 19104 USA
[9] Univ Maryland, College Pk, MD 20742 USA
[10] Univ Washington, Seattle, WA 98195 USA
Keywords
Retrieval; Image description; Data driven; Big data; Natural language processing; Scene
DOI
10.1007/s11263-015-0840-y
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
What is the story of an image? What is the relationship between pictures, language, and information we can extract using state of the art computational recognition systems? In an attempt to address both of these questions, we explore methods for retrieving and generating natural language descriptions for images. Ideally, we would like our generated textual descriptions (captions) to both sound like a person wrote them, and also remain true to the image content. To do this we develop data-driven approaches for image description generation, using retrieval-based techniques to gather either: (a) whole captions associated with a visually similar image, or (b) relevant bits of text (phrases) from a large collection of image + description pairs. In the case of (b), we develop optimization algorithms to merge the retrieved phrases into valid natural language sentences. The end result is two simple, but effective, methods for harnessing the power of big data to produce image captions that are altogether more general, relevant, and human-like than previous attempts.
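Approach (a) in the abstract, transferring the whole caption of a visually similar image, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional feature vectors, the cosine similarity measure, the function names, and the toy collection are all assumptions chosen for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def retrieve_caption(query_features, collection):
    """Return the caption paired with the most similar image.

    `collection` is a list of (features, caption) pairs; both the
    representation and the similarity measure are hypothetical.
    """
    if not collection:
        raise ValueError("collection must not be empty")
    _, best_caption = max(
        collection,
        key=lambda pair: cosine_similarity(query_features, pair[0]),
    )
    return best_caption


# Toy image + description pairs (invented data, not from the paper).
collection = [
    ([0.9, 0.1, 0.0], "A dog running on the beach."),
    ([0.1, 0.8, 0.1], "A plate of pasta on a wooden table."),
    ([0.0, 0.2, 0.9], "A red bicycle leaning against a wall."),
]

print(retrieve_caption([0.8, 0.2, 0.1], collection))
```

With a large enough collection, the nearest neighbor is more likely to depict similar content, which is the intuition behind using big data for whole-caption transfer.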
Pages: 46-59
Number of pages: 14