SmallClient for big data: an indexing framework towards fast data retrieval

Cited by: 15
Authors
Siddiqa, Aisha [1]
Karim, Ahmad [2]
Chang, Victor [3]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[2] Bahauddin Zakariya Univ, Dept Informat Technol, Multan 60000, Pakistan
[3] Xian Jiaotong Liverpool Univ, IBSS, Suzhou 100044, Peoples R China
Keywords
Big data; Big data indexing; Big data retrieval; Big data analytics; Query execution; Data search performance; CLOUD; EFFICIENT; PERFORMANCE; TAXONOMY; STORAGE
DOI
10.1007/s10586-016-0712-4
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Numerous applications continuously generate massive amounts of data, and it has become critical to extract useful information while maintaining acceptable computing performance. The objective of this work is to design an indexing framework that minimizes indexing overhead and improves query execution and data search performance with optimum aggregation of computing performance. We propose SmallClient, an indexing framework to speed up query execution. SmallClient has three modules: block creation, index creation, and query execution. The block creation module improves data retrieval performance with minimal data uploading overhead. The index creation module allows the maximum number of indexes on a dataset, increasing the index hit ratio while minimizing indexing overhead. Finally, the query execution module lets incoming queries utilize these indexes. The evaluation shows that SmallClient outperforms a Hadoop full scan by more than 90% in search performance. Meanwhile, the indexing overhead of SmallClient is reduced to approximately 50% and 80% for index size and indexing time, respectively.
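
The three modules named in the abstract can be illustrated with a toy, in-memory sketch. This is not the authors' implementation: the function names (create_blocks, create_index, query), the fixed block size, and the value-to-block-id index layout are all assumptions made for illustration, and nothing here runs on Hadoop. It only shows why consulting a block-level index can let a query read fewer blocks than a full scan.

```python
from collections import defaultdict


def create_blocks(records, block_size):
    """Hypothetical block creation: split records into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def create_index(blocks, field):
    """Hypothetical index creation: map each value of `field` to the ids of blocks holding it."""
    index = defaultdict(set)
    for block_id, block in enumerate(blocks):
        for record in block:
            index[record[field]].add(block_id)
    return index


def query(blocks, indexes, field, value):
    """Hypothetical query execution: use an index on `field` if present, else scan every block.

    Returns the matching records and the number of blocks that were read.
    """
    if field in indexes:
        block_ids = sorted(indexes[field].get(value, ()))
    else:
        block_ids = range(len(blocks))  # full scan fallback
    hits, scanned = [], 0
    for block_id in block_ids:
        scanned += 1
        hits.extend(r for r in blocks[block_id] if r[field] == value)
    return hits, scanned


# Illustrative data (assumed): 100 records, the first 10 tagged "A", the rest "B".
records = [{"id": i, "tag": "A" if i < 10 else "B"} for i in range(100)]
blocks = create_blocks(records, block_size=10)
indexes = {"tag": create_index(blocks, "tag")}

indexed_hits, indexed_scanned = query(blocks, indexes, "tag", "A")
full_hits, full_scanned = query(blocks, {}, "tag", "A")
print(indexed_scanned, full_scanned)
```

In this toy setup both calls return the same records, but the indexed query reads only the blocks listed in the index while the fallback reads all of them. The gain depends entirely on how values are distributed across blocks, so the numbers here say nothing about the results reported in the abstract.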
Pages: 1193-1208
Page count: 16