SmallClient for big data: an indexing framework towards fast data retrieval

Cited by: 15
Authors
Siddiqa, Aisha [1]
Karim, Ahmad [2]
Chang, Victor [3]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[2] Bahauddin Zakariya Univ, Dept Informat Technol, Multan 60000, Pakistan
[3] Xian Jiaotong Liverpool Univ, IBSS, Suzhou 100044, Peoples R China
Keywords
Big data; Big data indexing; Big data retrieval; Big data analytics; Query execution; Data search performance; CLOUD; EFFICIENT; PERFORMANCE; TAXONOMY; STORAGE;
DOI
10.1007/s10586-016-0712-4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Numerous applications continuously generate massive amounts of data, and it has become critical to extract useful information while maintaining acceptable computing performance. The objective of this work is to design an indexing framework that minimizes indexing overhead and improves query execution and data search performance with optimum aggregation of computing performance. We propose SmallClient, an indexing framework to speed up query execution. SmallClient has three modules: block creation, index creation, and query execution. The block creation module improves data retrieval performance with minimal data uploading overhead. The index creation module allows the maximum number of indexes on a dataset, which increases the index hit ratio while keeping indexing overhead low. Finally, the query execution module lets incoming queries use these indexes. The evaluation shows that SmallClient outperforms a Hadoop full scan, with a search performance gain of more than 90%. Meanwhile, the indexing overhead of SmallClient is reduced to approximately 50% for index size and 80% for indexing time.
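To illustrate the three-module structure described in the abstract, here is a minimal in-memory sketch. It is not the authors' implementation: the function names (`create_blocks`, `create_index`, `execute_query`), the fixed block size, and the sample records are all assumptions made for illustration, and the real framework targets Hadoop rather than Python lists.

```python
from collections import defaultdict


def create_blocks(records, block_size):
    """Block creation (hypothetical): split records into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def create_index(blocks, field):
    """Index creation (hypothetical): map each value of `field` to the
    ids of the blocks that contain at least one record with that value."""
    index = defaultdict(set)
    for block_id, block in enumerate(blocks):
        for record in block:
            index[record[field]].add(block_id)
    return index


def execute_query(blocks, indexes, field, value):
    """Query execution (hypothetical): read only the indexed blocks when an
    index on `field` exists, otherwise fall back to a full scan.
    Returns the matching records and the number of blocks read."""
    if field in indexes:
        block_ids = sorted(indexes[field].get(value, ()))
    else:
        block_ids = list(range(len(blocks)))
    matches = [r for b in block_ids for r in blocks[b] if r[field] == value]
    return matches, len(block_ids)


# Assumed sample data: 12 records, the first 3 tagged "Multan".
records = [{"id": i, "city": "Multan" if i < 3 else "Suzhou"} for i in range(12)]
blocks = create_blocks(records, block_size=3)
indexes = {"city": create_index(blocks, "city")}

indexed_hits, indexed_reads = execute_query(blocks, indexes, "city", "Multan")
scan_hits, scan_reads = execute_query(blocks, {}, "city", "Multan")
print(indexed_reads, scan_reads)
```

Both calls return the same records; the indexed call reads only the blocks listed in the index, while the call without an index reads every block. This toy says nothing about the overhead or speedup figures reported in the abstract.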
Pages: 1193-1208
Page count: 16