Efficient Indexing of Similarity Models with Inequality Symbolic Regression

Cited by: 0
Authors
Bartos, Tomas [1 ]
Skopal, Tomas [1 ]
Mosko, Juraj [1 ]
Affiliations
[1] Charles Univ Prague, Fac Math & Phys, SIRET Grp, Prague, Czech Republic
Keywords
Genetic programming; Similarity research; Content based retrieval;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The increasing amount of available unstructured content has introduced a new concept of searching for information: content-based retrieval. The underlying principle is that objects are compared based on their content, which is far more complex than simple text- or metadata-based searching. Many indexing techniques have arisen to provide efficient and effective similarity searching. However, these methods are restricted to a specific domain, such as the metric space model. If this prerequisite is not fulfilled, indexing cannot be used, and each similarity search query degrades to a sequential scan, which is unacceptable for large datasets. Inspired by previous successful results, we decided to apply the principles of genetic programming to the area of database indexing. We developed GP-SIMDEX, a universal framework capable of finding precise and efficient indexing methods for similarity searching over any given similarity data. For this purpose, we introduce the inequality symbolic regression principle and show how it helps the GP-SIMDEX framework find appropriate results that in most cases outperform the best-known indexing methods.
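To make the contrast between metric indexing and sequential scanning concrete, the following is a minimal sketch of classic pivot-based filtering, which relies on the triangle inequality of a metric distance. It is not the GP-SIMDEX framework or the inequality symbolic regression method itself; all function names (`sequential_range_query`, `build_pivot_table`, `pivot_range_query`) and the choice of Euclidean distance are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """A metric distance; it satisfies the triangle inequality."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def sequential_range_query(data: Sequence[Point], query: Point,
                           radius: float, dist: Distance) -> List[int]:
    """Baseline: compute the distance to every object (sequential scan)."""
    return [i for i, obj in enumerate(data) if dist(query, obj) <= radius]


def build_pivot_table(data: Sequence[Point], pivots: Sequence[Point],
                      dist: Distance) -> List[List[float]]:
    """Precompute object-to-pivot distances once, at indexing time."""
    return [[dist(obj, p) for p in pivots] for obj in data]


def pivot_range_query(data: Sequence[Point], pivots: Sequence[Point],
                      table: List[List[float]], query: Point,
                      radius: float, dist: Distance) -> Tuple[List[int], int]:
    """Range query that skips objects using a triangle-inequality lower bound.

    For a metric d, |d(q, p) - d(o, p)| <= d(q, o) for any pivot p, so an
    object whose lower bound exceeds the radius cannot be in the result.
    Returns the hit indices and the number of full distance computations.
    """
    query_to_pivot = [dist(query, p) for p in pivots]
    hits: List[int] = []
    computed = 0
    for i, obj in enumerate(data):
        lower = max(abs(qp - op) for qp, op in zip(query_to_pivot, table[i]))
        if lower > radius:
            continue  # safely filtered without computing d(query, obj)
        computed += 1
        if dist(query, obj) <= radius:
            hits.append(i)
    return hits, computed


data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
pivots = [(0.0, 0.0)]
table = build_pivot_table(data, pivots, euclidean)
hits, computed = pivot_range_query(data, pivots, table, (0.5, 0.0), 1.0, euclidean)
```

The filter is only correct when the distance obeys the triangle inequality; for a non-metric similarity function the lower bound may discard true results, which is the situation in which, per the abstract, a query otherwise falls back to the sequential scan.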
Pages: 901 - 908
Page count: 8
Related papers
50 in total
  • [41] Symbolic Regression for the Estimation of Transfer Functions of Hydrological Models
    Klotz, D.
    Herrnegger, M.
    Schulz, K.
    WATER RESOURCES RESEARCH, 2017, 53 (11) : 9402 - 9423
  • [42] Optimal Indexing: An Efficient Feature-Based Indexing Framework for Similarity Data Sharing at the Network Edge
    Sun, Yuchen
    Luo, Lailong
    Guo, Deke
    Liu, Li
    Ren, Bangbang
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024,
  • [43] A novel indexing approach for efficient and fast similarity search of captured motions
    Li, Chuanjun
    Prabhakaran, B.
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 689 - 698
  • [44] Indexing expensive functions for efficient multi-dimensional similarity search
    Chen, Hanxiong
    Liu, Jianquan
    Furuse, Kazutaka
    Yu, Jeffrey Xu
    Ohbo, Nobuo
    KNOWLEDGE AND INFORMATION SYSTEMS, 2011, 27 (02) : 165 - 192
  • [45] An Efficient Document Indexing-Based Similarity Search in Large Datasets
    Trong Nhan Phan
    Jaeger, Markus
    Nadschlaeger, Stefan
    Kueng, Josef
    Tran Khanh Dang
    FUTURE DATA AND SECURITY ENGINEERING, FDSE 2015, 2015, 9446 : 16 - 31
  • [46] Indexing expensive functions for efficient multi-dimensional similarity search
    Hanxiong Chen
    Jianquan Liu
    Kazutaka Furuse
    Jeffrey Xu Yu
    Nobuo Ohbo
    Knowledge and Information Systems, 2011, 27 : 165 - 192
  • [47] Indexing of Spatiotemporal Trajectories for Efficient Distance Threshold Similarity Searches on the GPU
    Gowanlock, Michael
    Casanova, Henri
    2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2015, : 387 - 396
  • [48] Combination of similarity measures based on symbolic regression for confusing drug names identification
    Vazquez, Eder Vazquez
    Ledeneva, Yulia
    Garcia-Hernandez, Rene Arnulfo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02) : 2093 - 2103
  • [49] Abstraction by symbolic indexing transformations
    Melham, TF
    Jones, RB
    FORMAL METHODS IN COMPUTER-AIDED DESIGN, PROCEEDINGS, 2002, 2517 : 1 - 18
  • [50] Efficient Approaches to Interleaved Sampling of training data for Symbolic Regression
    Azad, R. Muhammad Atif
    Medernach, David
    Ryan, Conor
    2014 SIXTH WORLD CONGRESS ON NATURE AND BIOLOGICALLY INSPIRED COMPUTING (NABIC), 2014, : 176 - 183