Searching for supermaximal repeats in large DNA sequences

Cited by: 0

Authors:

Lian, Chen Na [1]

Halachev, Mihail [1]

Shiri, Nematollaah [1]

Affiliations:

[1] Concordia Univ, Dept Comp Sci & Software Engn, Montreal, PQ, Canada

Source:

BIOINFORMATICS RESEARCH AND DEVELOPMENT, PROCEEDINGS | 2008 / Vol. 13

Keywords:

DNA sequences; supermaximal repeats; suffix tree; performance;

DOI:

Not available

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

We study the problem of finding supermaximal repeats in large DNA sequences. To this end, we propose an algorithm called SMR, which uses an auxiliary index structure (POL) that is derived from and replaces the suffix tree index ST-FD64 [1]. The results of our numerous experiments on the 24 human chromosomes indicate that SMR outperforms the solution provided as part of the Vmatch [2] software tool. When searching for supermaximal repeats of at least 10 bases, SMR is twice as fast as Vmatch; for a minimum length of 25 bases, SMR is 7 times faster; and for repeats of length at least 200, SMR is about 9 times faster. We also study the cost of POL in terms of time and space requirements.
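To make the search target concrete, the following is a minimal brute-force sketch of what a supermaximal repeat is. It is not SMR and does not use POL, ST-FD64, or Vmatch; the function name `supermaximal_repeats` and the parameter `min_len` are hypothetical names chosen for illustration. It assumes the usual definition: a repeat is a substring that occurs at least twice (overlaps allowed), and it is supermaximal if it is not a proper substring of any other repeat, which is equivalent to no one-character extension to the left or right being a repeat.

```python
from collections import defaultdict


def supermaximal_repeats(seq, min_len=1):
    """Return the supermaximal repeats of seq with length >= min_len.

    Toy illustration only: it enumerates all O(n^2) substrings, so it is
    suitable for short strings, not for chromosome-scale input.
    """
    # Count occurrences (overlapping ones included) of every substring.
    counts = defaultdict(int)
    n = len(seq)
    for i in range(n):
        for j in range(i + 1, n + 1):
            counts[seq[i:j]] += 1

    # A repeat is any substring that occurs at least twice.
    repeats = {s for s, c in counts.items() if c >= 2}
    alphabet = set(seq)

    result = []
    for r in repeats:
        if len(r) < min_len:
            continue
        # If some one-character extension is still a repeat, then r is
        # contained in a longer repeat and is not supermaximal.
        extendable = any(
            (c + r) in repeats or (r + c) in repeats for c in alphabet
        )
        if not extendable:
            result.append(r)
    return sorted(result)


if __name__ == "__main__":
    print(supermaximal_repeats("GATTACAGATTACA"))
```

The `min_len` filter mirrors the minimum repeat lengths mentioned in the abstract (10, 25, and 200 bases). Index-based approaches such as suffix trees exist precisely to avoid the quadratic enumeration used in this sketch.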


Pages: 87-101

Number of pages: 15

