A Comparative Study of Secondary Indexing Techniques in LSM-based NoSQL Databases

Cited by: 26
Authors
Qader, Mohiuddin Abdul [1]
Cheng, Shiwen [1]
Hristidis, Vagelis [1]
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
DOI
10.1145/3183713.3196900
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
NoSQL databases are increasingly used in big data applications because they achieve fast write throughput and fast lookups on the primary key. Many of these applications also require queries on non-primary attributes. For that reason, several NoSQL databases have added support for secondary indexes. However, these works are fragmented, as each system generally supports one type of secondary index and may use different names, or no name at all, to refer to such indexes. As there is no single system that supports all types of secondary indexes, no experimental head-to-head comparison or performance analysis of the various secondary indexing techniques in terms of throughput and space exists. In this paper, we present a taxonomy of NoSQL secondary indexes, broadly split into two classes: Embedded Indexes (i.e., lightweight filters embedded inside the primary table) and Stand-Alone Indexes (i.e., separate data structures). To ensure the fairness of our comparative study, we built a system, LevelDB++, on top of Google's popular open-source LevelDB key-value store. There, we implemented two Embedded Indexes and three state-of-the-art Stand-Alone Indexes, which cover most of the popular NoSQL databases. Our comprehensive experimental study and theoretical evaluation show that none of these indexing techniques dominates the others: the Embedded Indexes offer superior write throughput and are more space efficient, whereas the Stand-Alone Indexes achieve faster query response times. Thus, the optimal choice of secondary index depends on the application workload. This paper provides an empirical guideline for choosing secondary indexes.
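The two index classes named in the abstract can be contrasted with a toy model. The sketch below is illustrative only and is not the LevelDB++ implementation: the class names, the fixed-size "blocks", and the use of a plain Python set in place of a real filter (such as a Bloom filter) are all assumptions made for this example, and it handles inserts only (no updates, deletes, or compaction).

```python
from collections import defaultdict


class EmbeddedIndexTable:
    """Hypothetical toy primary table split into blocks. Each block carries
    a small set of secondary-attribute values, standing in for a filter
    embedded inside the primary table."""

    def __init__(self, block_size=2):
        self.block_size = block_size
        self.blocks = []  # list of (records dict, filter set) pairs

    def put(self, key, attr):
        # A single write path: append to the newest block and update its filter.
        if not self.blocks or len(self.blocks[-1][0]) >= self.block_size:
            self.blocks.append(({}, set()))
        records, filt = self.blocks[-1]
        records[key] = attr
        filt.add(attr)

    def lookup(self, attr):
        # Every block's filter is consulted; only matching blocks are scanned.
        hits = []
        for records, filt in self.blocks:
            if attr in filt:
                hits.extend(k for k, a in records.items() if a == attr)
        return sorted(hits)


class StandAloneIndexTable:
    """Hypothetical toy primary table plus a separate structure that maps
    each secondary-attribute value to the primary keys holding it."""

    def __init__(self):
        self.primary = {}
        self.index = defaultdict(set)

    def put(self, key, attr):
        # Two writes per insert: one to the primary table, one to the index.
        self.primary[key] = attr
        self.index[attr].add(key)

    def lookup(self, attr):
        # One probe of the separate index returns the matching primary keys.
        return sorted(self.index.get(attr, ()))


if __name__ == "__main__":
    embedded, stand_alone = EmbeddedIndexTable(), StandAloneIndexTable()
    for key, user in [("t1", "alice"), ("t2", "bob"), ("t3", "alice")]:
        embedded.put(key, user)
        stand_alone.put(key, user)
    print(embedded.lookup("alice"), stand_alone.lookup("alice"))
```

In this toy model the embedded variant stores no second structure and writes once per record, but a lookup touches every block's filter; the stand-alone variant answers a lookup with one index probe at the cost of an extra write and extra space. That mirrors, in spirit only, the trade-off the abstract reports.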
Pages: 551 - 566
Page count: 16
Related papers
  • [1] Perseid: A Secondary Indexing Mechanism for LSM-Based Storage Systems
    Wang, Jing
    Lu, Youyou
    Wang, Qing
    Zhang, Yuhao
    Shu, Jiwu
    ACM TRANSACTIONS ON STORAGE, 2024, 20 (02)
  • [2] Revisiting Secondary Indexing in LSM-based Storage Systems with Persistent Memory
    Wang, Jing
    Lu, Youyou
    Wang, Qing
    Zhang, Yuhao
    Shu, Jiwu
    PROCEEDINGS OF THE 2023 USENIX ANNUAL TECHNICAL CONFERENCE, 2023: 817 - 832
  • [3] SineKV: Decoupled Secondary Indexing for LSM-based Key-Value Stores
    Li, Fei
    Lu, Youyou
    Yang, Zhe
    Shu, Jiwu
    2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2020: 1112 - 1122
  • [4] LSM-based storage techniques: a survey
    Chen Luo
    Michael J. Carey
    The VLDB Journal, 2020, 29 : 393 - 418
  • [5] LSM-based storage techniques: a survey
    Luo, Chen
    Carey, Michael J.
    VLDB JOURNAL, 2020, 29 (01): 393 - 418
  • [6] Hailstorm: Disaggregated Compute and Storage for Distributed LSM-based Databases
    Bindschaedler, Laurent
    Goel, Ashvin
    Zwaenepoel, Willy
    TWENTY-FIFTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXV), 2020: 301 - 316
  • [7] A Comparative Study of NoSQL Databases
    Aqel, Musbah J.
    Al-Sakran, Aya
    Hunaity, Mohammad
    BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2019, 12 (01): 17 - 26
  • [8] Bulk Loading of the Secondary Index in LSM-Based Stores for Flash Memory
    Macyna, Wojciech
    Kukowski, Michal
    NEW TRENDS IN DATABASE AND INFORMATION SYSTEMS, ADBIS 2022, 2022, 1652 : 133 - 143
  • [9] Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores
    Macyna, Wojciech
    Kukowski, Michal
    Zwarzko, Michal
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2023, PT II, 2023, 14147 : 245 - 257
  • [10] The study of indexing techniques on object oriented databases
    Huang, YF
    Chen, JM
    INFORMATION SCIENCES, 2000, 130 (1-4) : 109 - 131