Gerbil: A Fast and Memory-Efficient k-mer Counter with GPU-Support

Cited by: 1
Authors
Erbert, Marius [1]
Rechner, Steffen [1]
Mueller-Hannemann, Matthias [1]
Affiliations
[1] Univ Halle Wittenberg, Inst Comp Sci, Halle, Germany
Source
ALGORITHMS IN BIOINFORMATICS | 2016 / Vol. 9838
DOI
10.1007/978-3-319-43681-4_12
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
A basic task in bioinformatics is the counting of k-mers in genome strings. The k-mer counting problem is to build a histogram of all substrings of length k in a given genome sequence. We present the open source k-mer counting software Gerbil that has been designed for the efficient counting of k-mers for k >= 32. Given the technology trend towards long reads of next-generation sequencers, support for large k becomes increasingly important. While existing k-mer counting tools suffer from excessive memory resource consumption or degrading performance for large k, Gerbil is able to efficiently support large k without much loss of performance. Our software implements a two-disk approach. In the first step, DNA reads are loaded from disk and distributed to temporary files that are stored at a working disk. In a second step, the temporary files are read again, split into k-mers and counted via a hash table approach. In addition, Gerbil can optionally use GPUs to accelerate the counting step. For large k, we outperform state-of-the-art open source k-mer counting tools by up to a factor of 4 for large genome data sets.
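The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It is not Gerbil's implementation. The abstract does not say how reads are distributed to temporary files, so the sketch assumes a minimizer-based split into "super-mers" (runs of consecutive k-mers sharing the same smallest m-mer). The function names, the parameters `m` and `n_buckets`, and the CRC32 bucket choice are all hypothetical. Reverse complements and GPU counting are not handled, and a small k is used only for readability.

```python
import os
import tempfile
import zlib
from collections import Counter


def minimizer(kmer, m):
    """Return the lexicographically smallest length-m substring of kmer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def bucket_of(signature, n_buckets):
    """Map a minimizer to a bucket index (CRC32 is an illustrative choice)."""
    return zlib.crc32(signature.encode()) % n_buckets


def distribute(reads, k, m, n_buckets, workdir):
    """Step 1: cut each read into super-mers and append them to bucket files."""
    paths = [os.path.join(workdir, f"bucket_{b}.txt") for b in range(n_buckets)]
    files = [open(p, "w") for p in paths]
    try:
        for read in reads:
            if len(read) < k:
                continue  # too short to contain any k-mer
            start = 0
            current = minimizer(read[:k], m)
            for i in range(1, len(read) - k + 1):
                nxt = minimizer(read[i:i + k], m)
                if nxt != current:
                    # k-mers start..i-1 share a minimizer; emit them as one piece
                    files[bucket_of(current, n_buckets)].write(
                        read[start:i + k - 1] + "\n")
                    start, current = i, nxt
            files[bucket_of(current, n_buckets)].write(read[start:] + "\n")
    finally:
        for f in files:
            f.close()
    return paths


def count_bucket(path, k):
    """Step 2: re-read one temporary file, split into k-mers, count in a hash table."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            piece = line.strip()
            for i in range(len(piece) - k + 1):
                counts[piece[i:i + k]] += 1
    return counts


def count_kmers(reads, k, m=3, n_buckets=4):
    """Run both steps; the temporary directory stands in for the working disk."""
    total = Counter()
    with tempfile.TemporaryDirectory() as workdir:
        for path in distribute(reads, k, m, n_buckets, workdir):
            total.update(count_bucket(path, k))
    return total
```

For example, `count_kmers(["ACGTACGT"], k=4)` counts `ACGT` twice and each of `CGTA`, `GTAC`, and `TACG` once. Because the bucket is a function of the k-mer's own minimizer, identical k-mers always land in the same temporary file, so each file can be counted independently with a table that covers only a fraction of the input.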
Pages: 150-161
Page count: 12