Cross-media retrieval based on locality-sensitive hashing and neural network algorithms

Cited by: 0
Authors
Bai L. [1]
Jia Y. [1]
Wang H. [1]
Xie Y. [1]
Yu T. [1]
Affiliation
[1] College of Systems Engineering, National University of Defense Technology, Changsha
Source
2018 / National University of Defense Technology, Vol. / Issue 40
Keywords
Cross-media retrieval; Locality-sensitive hashing algorithm; Multimodal data indexing; Neural network algorithm
DOI
10.11887/j.cn.201801014
Chinese Library Classification number
G252.7 [Document retrieval]; G354 [Information retrieval]
Abstract
Efficient retrieval over multimodal data requires reducing the proportion of irrelevant documents among the results. In the proposed approach, image data are projected into Hamming space with a locality-sensitive hashing algorithm, and a neural network learns to map text data onto the same hash functions in Hamming space; together these form a novel cross-media retrieval method aimed at reducing the proportion of irrelevant documents. Experiments show that the proposed method substantially increases the proportion of relevant documents. Evaluations on two public datasets also demonstrate the efficacy and accuracy of the method compared with the baselines. © 2018, NUDT Press. All rights reserved.
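The image-side step described in the abstract can be illustrated with a minimal sketch. It assumes random-hyperplane (sign) hashing as the locality-sensitive scheme, which the abstract does not specify, and every name, dimension, and toy vector below is hypothetical. The text-side neural network is not sketched; the query code simply stands in for the binary code such a network would be trained to output.

```python
import random

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplanes, one per hash bit (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_code(vec, planes):
    """Map a real-valued vector to a binary code: bit i is 1 if vec lies on
    the non-negative side of hyperplane i, else 0."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def rank_by_hamming(query_code, db_codes):
    """Database indices sorted by Hamming distance to the query (ties by index)."""
    return sorted(
        range(len(db_codes)),
        key=lambda i: (hamming(query_code, db_codes[i]), i),
    )

# Hypothetical 4-dimensional image features, chosen so the outcome is predictable:
# item 0 points in the same direction as the query, item 2 in the opposite one.
image_features = [
    [2.0, 2.0, 0.0, 0.0],
    [0.5, 1.0, 0.8, 0.0],
    [-1.0, -1.0, 0.0, 0.0],
]

planes = make_hyperplanes(n_bits=16, dim=4)
image_codes = [lsh_code(v, planes) for v in image_features]

# Stand-in for the code a trained text network would produce for a query.
query_code = lsh_code([1.0, 1.0, 0.0, 0.0], planes)
ranking = rank_by_hamming(query_code, image_codes)
print(ranking)
```

Because sign hashing depends only on direction, item 0 receives the same code as the query and ranks first, while item 2 differs in every bit and ranks last. In a cross-media setting, the point of learning the text mapping is that text queries and image codes become comparable under the same Hamming distance.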
Pages: 93-98
Number of pages: 5
Related papers
15 in total
  • [1] Blei D.M., Ng A.Y., Jordan M.I., Latent Dirichlet allocation, Journal of Machine Learning Research, 3, pp. 993-1022, (2003)
  • [2] Blei D.M., Jordan M.I., Modeling annotated data, Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 127-134, (2003)
  • [3] Jia Y., Salzmann M., Darrell T., Learning cross modality similarity for multinomial data, Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 2407-2414, (2011)
  • [4] Bian J., Yang Y., Zhang H.W., et al., Multimedia summarization for social events in microblog stream, IEEE Transactions on Multimedia, 17, 2, pp. 216-228, (2015)
  • [5] Hardoon D.R., Szedmak S., Shawe-Taylor J., Canonical correlation analysis: an overview with application to learning methods, Neural Computation, 16, 12, pp. 2639-2664, (2004)
  • [6] Ranjan V., Rasiwasia N., Jawahar C.V., Multi-label cross-modal retrieval, Proceedings of IEEE International Conference on Computer Vision, pp. 4094-4102, (2015)
  • [7] Gong Y.C., Ke Q.D., Isard M., et al., A multi-view embedding space for modeling internet images, tags, and their semantics, International Journal of Computer Vision, 106, 2, pp. 210-233, (2014)
  • [8] Wu F., Lu X.Y., Zhang Z.F., et al., Cross-media semantic representation via bi-directional learning to rank, Proceedings of the 21st ACM International Conference on Multimedia, pp. 877-886, (2013)
  • [9] Yan F., Mikolajczyk K., Deep correlation for matching images and text, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3441-3450, (2015)
  • [10] Yu T.Y., Bai L., Guo J.L., et al., A deep two-stream network for bidirectional cross-media information retrieval, Proceedings of Pacific Rim Conference on Multimedia, pp. 328-337, (2016)