Efficient batch similarity join processing of social images based on arbitrary features

Cited by: 0
Authors
Yi Zhuang
Nan Jiang
Zhi-Ang Wu
Jie Cao
Chunhua Ju
Affiliations
[1] Zhejiang Gongshang University, College of Computer and Information Engineering
[2] Hangzhou First People’s Hospital, Jiangsu Provincial Key Laboratory of E
[3] Nanjing University of Finance and Economics, Business
Source
World Wide Web | 2016 / Vol. 19
Keywords
Social image; High-dimensional indexing; Join box; Batch similarity join
Abstract
In this paper, we identify and solve a multi-join optimization problem for Arbitrary Feature-based social image Similarity JOINs (AFS-JOIN). Given two collections (i.e., R and S) of social images that carry visual, spatial, and textual (i.e., tag) information, the multiple joins based on arbitrary features retrieve the pairs of images that are visually similar, textually similar, or spatially close from different users. To address this problem, we propose three methods to facilitate multi-join processing: 1) two baseline approaches (i.e., a naïve join approach and a maximal threshold (MT)-based approach), and 2) a Batch Similarity Join (BSJ) method. In the BSJ method, given m users’ join requests, the requests are first converted and grouped into m″ clusters, which correspond to m″ join boxes, where m > m″. To speed up BSJ processing, the feature distance space is first partitioned into cubes based on four segmentation schemes, and the image pairs falling in the cubes are indexed by the cube tree index. BSJ processing is thus transformed into searching, with the aid of the index, for the image pairs that fall in the cubes affected by the m″ AFS-JOINs. An extensive experimental evaluation using real and synthetic datasets shows that our proposed BSJ technique outperforms the state-of-the-art solutions.
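The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea it describes: image pairs are mapped to points in a feature distance space, bucketed into cubes, and each join request is answered by scanning only the cubes its join box can touch. Everything concrete here is an assumption made for illustration: the uniform grid stands in for the paper's four segmentation schemes, a plain dictionary stands in for the cube tree index, the distance functions (Euclidean and Jaccard) are placeholders, and the clustering of m requests into m″ join boxes is omitted. The names `distance_vector`, `build_cube_index`, `join_box_query`, and `batch_join` are hypothetical.

```python
import math
from collections import defaultdict


def distance_vector(r, s):
    """Return assumed (visual, spatial, textual) distances for one image pair."""
    visual = math.dist(r["visual"], s["visual"])   # placeholder: Euclidean
    spatial = math.dist(r["loc"], s["loc"])        # placeholder: Euclidean
    union = r["tags"] | s["tags"]
    # Placeholder: Jaccard distance between tag sets.
    textual = 1.0 - len(r["tags"] & s["tags"]) / len(union) if union else 0.0
    return (visual, spatial, textual)


def build_cube_index(R, S, cell):
    """Bucket every (r, s) pair into a cube of side `cell` in distance space.

    A uniform grid is an assumption; the paper's segmentation schemes and
    cube tree are not described in the abstract.
    """
    index = defaultdict(list)
    for i, r in enumerate(R):
        for j, s in enumerate(S):
            d = distance_vector(r, s)
            cube = tuple(int(x // cell) for x in d)
            index[cube].append((i, j, d))
    return index


def join_box_query(index, thresholds, cell):
    """Return pairs whose distance vector lies inside the box [0, t_k] per axis.

    Use math.inf as the threshold of a feature the request does not constrain.
    """
    result = []
    for cube, pairs in index.items():
        # Skip cubes whose lower corner already exceeds a threshold.
        if any(c * cell > t for c, t in zip(cube, thresholds)):
            continue
        for i, j, d in pairs:
            if all(x <= t for x, t in zip(d, thresholds)):
                result.append((i, j))
    return sorted(result)


def batch_join(index, requests, cell):
    """Answer several join requests against one shared index."""
    return [join_box_query(index, t, cell) for t in requests]
```

The point of the sketch is that the pairwise distances are computed once when the index is built, and each request then only filters cubes and pairs; how the paper actually groups requests and prunes cubes may differ substantially.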
Pages: 725-753
Page count: 28
Related papers
50 in total
  • [1] Efficient batch similarity join processing of social images based on arbitrary features
    Zhuang, Yi
    Jiang, Nan
    Wu, Zhi-Ang
    Cao, Jie
    Ju, Chunhua
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2016, 19 (04): 725 - 753
  • [2] Efficient and Scalable Processing of String Similarity Join
    Rong, Chuitian
    Lu, Wei
    Wang, Xiaoli
    Du, Xiaoyong
    Chen, Yueguo
    Tung, Anthony K. H.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (10) : 2217 - 2230
  • [3] An Efficient Batch Similarity Processing with MapReduce
    Trong Nhan Phan
    Tran Khanh Dang
    FUTURE DATA AND SECURITY ENGINEERING, FDSE 2018, 2018, 11251 : 158 - 171
  • [4] Efficient subgraph join based on connectivity similarity
    Wang, Yue
    Wang, Hongzhi
    Li, Jianzhong
    Gao, Hong
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2015, 18 (04): 871 - 887
  • [5] Efficient SimRank-Based Similarity Join
    Zheng, Weiguo
    Zou, Lei
    Chen, Lei
    Zhao, Dongyan
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2017, 42 (03):
  • [6] Efficient subgraph join based on connectivity similarity
    Yue Wang
    Hongzhi Wang
    Jianzhong Li
    Hong Gao
    World Wide Web, 2015, 18 : 871 - 887
  • [7] Efficient Spatio-Textual Similarity Join Processing on NUMA Systems
    Gautam, Saransh
    Ray, Suprio
    Nickerson, Bradford G.
    2021 22ND IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2021), 2021: 79 - 88
  • [8] A Lightweight Indexing Approach for Efficient Batch Similarity Processing with MapReduce
    Phan T.N.
    Dang T.K.
    SN Computer Science, 2020, 1 (1)
  • [9] A Spark Join Algorithm Based on Memory Monitoring and Batch Processing
    Cheng Kefei
    Luo Zhao
    Zhou Ke
    Deng Xianjun
    Chen Xudong
    PROCEEDINGS OF 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2018: 1096 - 1103
  • [10] Fast-join: An efficient method for fuzzy token matching based string similarity join
    Wang, Jiannan
    Li, Guoliang
    Feng, Jianhua
    Proceedings - International Conference on Data Engineering, 2011: 458 - 469