Detecting Near-Duplicate Document Images using Interest Point Matching

Cited by: 0
Authors
Vitaladevuni, Shiv [1 ]
Choi, Fred [1 ]
Prasad, Rohit [1 ]
Natarajan, Premkumar [1 ]
Affiliation
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
Source
2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012) | 2012
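Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes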
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an approach to detecting near-duplicate document images using SIFT interest point matching. Given a set of document images, a feature database is constructed from the SIFT features extracted from each image and stored as a kd-tree. The near-duplicates of a query image are estimated by directly matching its SIFT descriptors against the feature database. We demonstrate the approach on a challenging set of 16,000+ unconstrained handwritten and machine-printed Arabic document images collected in the field. Our experiments indicate that the approach detects near-duplicates with a low false alarm rate and outperforms a bag-of-words-based approach.
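The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic 8-dimensional vectors stand in for real 128-dimensional SIFT descriptors, and the per-document voting rule, the `max_sq_dist` threshold, and all function names are assumptions introduced here for illustration.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (descriptor, doc_id) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Exact nearest-neighbour search; returns (squared_distance, doc_id)."""
    if node is None:
        return best
    (vec, doc_id), axis, left, right = node
    d = sq_dist(vec, query)
    if best is None or d < best[0]:
        best = (d, doc_id)
    diff = query[axis] - vec[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best

def rank_near_duplicates(tree, query_descriptors, max_sq_dist):
    """Assumed voting rule: each query descriptor votes for the document
    that owns its nearest database descriptor, if the match is close enough."""
    votes = Counter()
    for q in query_descriptors:
        d, doc_id = nearest(tree, q)
        if d <= max_sq_dist:
            votes[doc_id] += 1
    return votes.most_common()

# Synthetic stand-in data (assumption): 5 documents, 40 descriptors each.
rng = random.Random(0)
DIM = 8

def random_descriptors(n):
    return [[rng.random() for _ in range(DIM)] for _ in range(n)]

database = {f"doc{i}": random_descriptors(40) for i in range(5)}
tree = build_kdtree([(vec, doc_id)
                     for doc_id, vecs in database.items()
                     for vec in vecs])

# The query is a slightly perturbed copy of doc3, i.e. a near-duplicate.
query = [[x + rng.gauss(0, 0.01) for x in vec] for vec in database["doc3"]]
ranking = rank_near_duplicates(tree, query, max_sq_dist=0.01)
print(ranking)
```

In practice an exact kd-tree search degrades in high dimensions, so a real system over 128-dimensional SIFT descriptors would likely use an approximate search; the abstract does not specify which variant was used.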
Pages: 347-350
Page count: 4
Related papers
50 in total
  • [1] Near-duplicate keyframe identification with interest point matching and pattern learning
    Zhao, Wan-Lei
    Ngo, Chong-Wah
    Tan, Hung-Khoon
    Wu, Xiao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2007, 9 (05) : 1037 - 1048
  • [2] Near-duplicate document image matching: A graphical perspective
    Liu, Li
    Lu, Yue
    Suen, Ching Y.
    PATTERN RECOGNITION, 2014, 47 (04) : 1653 - 1663
  • [3] Local Feature based Near-duplicate Images Detecting
    Liu, Zheng
    INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, 2011, 14 (03) : 957 - 962
  • [4] Novel Global and Local Features for Near-duplicate Document Image Matching
    Liu, Li
    Lu, Yue
    Suen, Ching Y.
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 4624 - 4629
  • [5] Efficient Near-Duplicate Document Detection using FPGAs
    Luo, Xi
    Najjar, Walid
    Hristidis, Vagelis
    2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,
  • [6] The study on Detecting near-Duplicate WebPages
    Cao, YuJuan
    Niu, ZhenDong
    Wang, WeiQiang
    Zhao, Kun
    2008 IEEE 8TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2008, : 95 - 100
  • [7] Detecting near-duplicate documents using paragraph features
    Wang, Haitao
    Liu, Shufen
    Jia, Zongpu
    Journal of Computational Information Systems, 2015, 11 (04) : 1295 - 1302
  • [8] Distinctive Interest Point Selection for Efficient Near-duplicate Image Retrieval
    Yildiz, Burak
    Demirci, M. Fatih
    2016 IEEE SOUTHWEST SYMPOSIUM ON IMAGE ANALYSIS AND INTERPRETATION (SSIAI), 2016, : 49 - 52
  • [9] Detecting Near-Duplicate SPITs in Voice Mailboxes Using Hashes
    Zhang, Ge
    Fischer-Hubner, Simone
    INFORMATION SECURITY, 2011, 7001 : 152 - 167
  • [10] Near-Duplicate Subsequence Matching for Video Streams
    Chiu, Chih-Yi
    Jhuang, Yi-Cheng
    Han, Guei-Wun
    Kang, Li-Wei
    2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,