RF: A method for filtering short reads with tandem repeats for genome mapping

Cited by: 7
Author
Misawa, Kazuharu [1]
Affiliation
[1] RIKEN, Res Program Computat Sci, Res & Dev Grp Next Generat Integrated Living Matt, Fus Data & Anal Res & Dev Team, Yokohama, Kanagawa 2300045, Japan
Keywords
Tandem repeats; Human genome; Mapping; Next-generation sequencing; REPETITIVE DNA; ALIGNMENT; PARAMETERS; ELEMENTS;
DOI
10.1016/j.ygeno.2013.03.002
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Next-generation sequencing platforms generate short (50-150 bp) reads that can be mapped onto the reference genome. Repetitive sequences in the genome cause mapping errors for short reads because similar or identical sequences are present elsewhere in the genome. Filtering out short reads that contain repeats should therefore improve mapping. I developed RF, a new method that filters short reads containing tandem repeats. Its scoring scheme assigns higher scores to regions with tandem repeats and lower scores to regions without them. In this study, RF was applied to filter out short reads with repeats before the reads were mapped onto the same genomic contig using a short-read mapping program. The results suggest that filtering repeats with RF improved the proportion of correctly mapped short reads. RF is a useful tool for reducing mapping errors of short reads onto reference genomes. (C) 2013 Elsevier Inc. All rights reserved.
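The abstract does not give the details of RF's scoring scheme, so the following is only a minimal illustrative sketch of the general idea (score each read for tandem repetitiveness, then drop high-scoring reads before mapping). It is not the author's algorithm. The function names `tandem_repeat_score` and `filter_reads`, the self-match scoring rule, and the parameters `max_period` and `threshold` are all assumptions introduced here for illustration.

```python
from typing import Iterable, List, Tuple


def tandem_repeat_score(read: str, max_period: int = 6) -> float:
    """Hypothetical score in [0, 1]: the highest fraction of positions i
    where read[i] == read[i + period], over periods 1..max_period.
    A perfect tandem repeat of unit length <= max_period scores 1.0."""
    read = read.upper()
    best = 0.0
    for period in range(1, max_period + 1):
        comparisons = len(read) - period
        if comparisons <= 0:
            break  # read is too short to compare at this period
        matches = sum(
            1
            for i in range(comparisons)
            # Ambiguous bases (N) are never counted as matches.
            if read[i] == read[i + period] and read[i] != "N"
        )
        best = max(best, matches / comparisons)
    return best


def filter_reads(
    reads: Iterable[str], threshold: float = 0.8, max_period: int = 6
) -> Tuple[List[str], List[str]]:
    """Split reads into (kept, removed); reads scoring at or above the
    assumed threshold are removed before mapping."""
    kept: List[str] = []
    removed: List[str] = []
    for read in reads:
        if tandem_repeat_score(read, max_period) >= threshold:
            removed.append(read)
        else:
            kept.append(read)
    return kept, removed


if __name__ == "__main__":
    example_reads = ["AC" * 10, "ACGTTGCAAGCTTAGGCATC"]
    kept, removed = filter_reads(example_reads)
    print("kept:", kept)
    print("removed:", removed)
```

In this sketch the dinucleotide repeat read is removed and the non-repetitive read is kept; in practice the kept reads would then be passed to a short-read mapper. The threshold and maximum period would need tuning against real data.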
Pages: 35-37
Number of pages: 3