SmallClient for big data: an indexing framework towards fast data retrieval

Cited by: 15
Authors
Siddiqa, Aisha [1]
Karim, Ahmad [2]
Chang, Victor [3]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[2] Bahauddin Zakariya Univ, Dept Informat Technol, Multan 60000, Pakistan
[3] Xian Jiaotong Liverpool Univ, IBSS, Suzhou 100044, Peoples R China
Keywords
Big data; Big data indexing; Big data retrieval; Big data analytics; Query execution; Data search performance; CLOUD; EFFICIENT; PERFORMANCE; TAXONOMY; STORAGE
DOI
10.1007/s10586-016-0712-4
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Numerous applications continuously generate massive amounts of data, and it has become critical to extract useful information while maintaining acceptable computing performance. The objective of this work is to design an indexing framework that minimizes indexing overhead and improves query execution and data search performance with optimum aggregation of computing performance. We propose SmallClient, an indexing framework to speed up query execution. SmallClient has three modules: block creation, index creation, and query execution. The block creation module improves data retrieval performance with minimum data uploading overhead. The index creation module allows the maximum number of indexes on a dataset, increasing the index hit ratio while minimizing indexing overhead. Finally, the query execution module lets incoming queries utilize these indexes. The evaluation shows that SmallClient outperforms a Hadoop full scan, with a search performance improvement of more than 90%. Meanwhile, the indexing overhead of SmallClient is reduced to approximately 50% for index size and 80% for indexing time.
Pages: 1193-1208
Page count: 16
Related papers
50 in total
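The three modules named in the abstract (block creation, index creation, query execution) can be illustrated with a toy, in-memory sketch. This is not the authors' implementation, and the abstract gives no algorithmic detail: the function names, the fixed block size, and the value-to-block-id index layout below are all assumptions chosen only to show how an index can let a query skip most blocks compared with a full scan.

```python
from collections import defaultdict


def create_blocks(records, block_size):
    """Block creation (assumed scheme): split records into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def create_index(blocks, field):
    """Index creation (assumed layout): map each field value to the block ids holding it."""
    index = defaultdict(set)
    for block_id, block in enumerate(blocks):
        for record in block:
            index[record[field]].add(block_id)
    return index


def execute_query(blocks, indexes, field, value):
    """Query execution: read only indexed blocks if an index exists, else scan all blocks."""
    if field in indexes:
        candidate_ids = sorted(indexes[field].get(value, ()))
    else:
        candidate_ids = list(range(len(blocks)))  # full scan fallback
    hits = [r for b in candidate_ids for r in blocks[b] if r[field] == value]
    return hits, len(candidate_ids)


# Hypothetical data: 8 records, 2 per block, only the first block matches.
records = [{"id": i, "city": "Multan" if i < 2 else "Suzhou"} for i in range(8)]
blocks = create_blocks(records, block_size=2)
indexes = {"city": create_index(blocks, "city")}

hits, scanned = execute_query(blocks, indexes, "city", "Multan")
full_hits, full_scanned = execute_query(blocks, {}, "city", "Multan")
print(len(hits), scanned, full_scanned)
```

Both calls return the same records; the indexed query simply reads fewer blocks. The real framework targets Hadoop-scale data, so the numbers here say nothing about the reported results.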
  • [1] SmallClient for big data: an indexing framework towards fast data retrieval
    Aisha Siddiqa
    Ahmad Karim
    Victor Chang
    Cluster Computing, 2017, 20 : 1193 - 1208
  • [2] Modeling SmallClient indexing framework for big data analytics
    Aisha Siddiqa
    Ahmad Karim
    Victor Chang
    The Journal of Supercomputing, 2018, 74 : 5241 - 5262
  • [3] Modeling SmallClient indexing framework for big data analytics
    Siddiqa, Aisha
    Karim, Ahmad
    Chang, Victor
    JOURNAL OF SUPERCOMPUTING, 2018, 74 (10): : 5241 - 5262
  • [4] Optimization Driven MapReduce Framework for Indexing and Retrieval of Big Data
    Abdalla, Hemn Barzan
    Ahmed, Awder Mohammed
    Al Sibahee, M. A.
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14 (05): : 1886 - 1908
  • [5] Fast indexing and retrieval of color image data
    Gupte, AV
    Berkovich, SY
    CISST '04: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGING SCIENCE, SYSTEMS, AND TECHNOLOGY, 2004, : 549 - 554
  • [6] Efficient indexing and retrieval of patient information from the big data using MapReduce framework and optimisation
    Merlin, N. R. Gladiss
    Prem, M. Vigilson
    JOURNAL OF INFORMATION SCIENCE, 2023, 49 (02) : 500 - 518
  • [7] Patterns of life in temporal data: indexing and hashing for fast and relevant data retrieval
    Jacobsen, Matthew
    Levchuk, Georgiy
    Weston, Mark
    Roberts, Jennifer
    MACHINE INTELLIGENCE AND BIO-INSPIRED COMPUTATION: THEORY AND APPLICATIONS VIII, 2014, 9119
  • [8] Indexing in Big Data
    Nashipudimath, Madhu M.
    Shinde, Subhash K.
    COMPUTING, COMMUNICATION AND SIGNAL PROCESSING, ICCASP 2018, 2019, 810 : 133 - 142
  • [9] Big Data, Big Gap: Working Towards a HIPAA Framework that Covers Big Data
    Mueller, Ryan
    INDIANA LAW JOURNAL, 2022, 97 (04) : 1505 - 1529
  • [10] TARDIS: Distributed Indexing Framework for Big Time Series Data
    Zhang, Liang
    Alghamdi, Noura
    Eltabakh, Mohamed Y.
    Rundensteiner, Elke A.
    2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019, : 1202 - 1213