A method to improve full-text search performance of MongoDB

Cited by: 1
Authors
Mesut, Altan [1]
Ozturk, Emir [1]
Affiliations
[1] Trakya Univ, Engn Fac, Dept Comp Engn, Edirne, Turkey
Keywords
NoSQL; MongoDB; Text index; Full-Text search; MWCA;
DOI
10.5505/pajes.2021.89590
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
The B-tree-based text indexes used in MongoDB are slow compared with other structures such as inverted indexes. This study shows that full-text search speed can be increased significantly by indexing a structure in which each distinct word in the text appears only once. The Multi-Stream Word-Based Compression Algorithm (MWCA), developed in our previous work, stores word dictionaries and data in separate streams. As documents were added to a MongoDB collection, they were encoded with MWCA and separated into six streams. Each stream was stored in its own field, and the three streams containing unique words were used to create the text index. As a result, the index was created in less time and took up less space. The Snappy and Zlib block compression methods used by MongoDB also reached higher compression ratios on data encoded with MWCA. Search tests on text indexes created on collections with different compression options show that our method speeds up searches by a factor of 19 to 146 and uses 34% to 40% less memory. Tests on regex searches, which do not use the text index, show that the MWCA model speeds up searches by a factor of 7 to 13 and uses 29% to 34% less memory.
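The idea of indexing only a field that holds each distinct word once can be illustrated with a minimal Python sketch. This is not MWCA: it uses two streams rather than six, and the function names and the field names `words` and `stream` are hypothetical, chosen only for illustration.

```python
import re


def encode_document(text):
    """Split text into a unique-word dictionary and a stream of indexes.

    Simplified illustration only: MWCA itself uses six streams, and the
    field names "words" and "stream" are assumptions made for this sketch.
    """
    tokens = re.findall(r"\w+", text.lower())
    dictionary = []   # each distinct word once, in first-seen order
    positions = {}    # word -> its index in the dictionary
    stream = []       # the text as a sequence of dictionary indexes
    for token in tokens:
        if token not in positions:
            positions[token] = len(dictionary)
            dictionary.append(token)
        stream.append(positions[token])
    # Only "words" would be text-indexed; "stream" is kept to rebuild the text.
    return {"words": " ".join(dictionary), "stream": stream}


def decode_document(doc):
    """Rebuild the normalized word sequence from the two fields."""
    words = doc["words"].split(" ")
    return " ".join(words[i] for i in doc["stream"])


doc = encode_document("The cat saw the other cat")
restored = decode_document(doc)
```

With PyMongo, a text index restricted to the dictionary field could then be created with `collection.create_index([("words", "text")])`, so that repeated words in the original text add nothing to the index. Whether this mirrors the authors' exact setup is an assumption.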
Pages: 720-729
Page count: 10