Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae

Cited by: 14
Authors
Luan, Mei-Wei [1]
Zhang, Xiao-Ming [2]
Zhu, Zi-Bin [1]
Chen, Ying [3]
Xie, Shang-Qian [1]
Affiliations
[1] Hainan Univ, Key Lab Genet & Germplasm Innovat Trop Special Fo, Minist Educ, Coll Forestry, Hainan Key Lab Biol Trop Ornamental Plant Germpla, Haikou, Hainan, Peoples R China
[2] Inner Mongolia Agr Univ, Coll Grassland Resources & Environm, Hohhot, Peoples R China
[3] Sun Yat Sen Univ, Zhongshan Ophthalm Ctr, State Key Lab Ophthalmol, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
structural variation; long-read sequencing; PacBio and ONT; SV caller; Saccharomyces cerevisiae; HUMAN GENOME; INSIGHTS; IMPACT;
DOI
10.3389/fgene.2020.00159
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Structural variation (SV) is a major form of genetic variation that contributes to polymorphism, human disease, and phenotypic differences in many organisms. Long-read sequencing has been used successfully to identify novel and complex SVs. However, no comparison of SV detection tools for long-read sequencing datasets has been reported. We therefore developed an analysis workflow that combines two alignment tools (NGMLR and minimap2) and five callers (Sniffles, Picky, smartie-sv, PBHoney, and NanoSV) to evaluate SV detection in six datasets of Saccharomyces cerevisiae. The accuracy of the detected SV regions was validated by re-aligning raw reads across different alignment tools, SV callers, experimental conditions, and sequencing platforms. SV detection did not differ significantly between NGMLR and minimap2 when the same caller was used. PBHoney had the highest average accuracy (89.04%) and Picky the lowest (35.85%). The accuracies of NanoSV, Sniffles, and smartie-sv were 68.67%, 60.47%, and 57.67%, respectively. In addition, smartie-sv detected the most SVs and NanoSV the fewest, and significantly more SVs were detected from the PacBio sequencing platform than from ONT (p = 0.000173).
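As a small illustration of the accuracy comparison reported in the abstract, the following sketch ranks the five callers by their stated average accuracy. The numbers are copied from the abstract; the variable and function names (`reported_accuracy`, `rank_callers`) are hypothetical and are not part of the authors' workflow.

```python
# Average accuracies (%) as stated in the abstract; nothing here is recomputed
# from the underlying datasets.
reported_accuracy = {
    "PBHoney": 89.04,
    "NanoSV": 68.67,
    "Sniffles": 60.47,
    "smartie-sv": 57.67,
    "Picky": 35.85,
}


def rank_callers(accuracy):
    """Return (caller, accuracy) pairs sorted from highest to lowest accuracy."""
    return sorted(accuracy.items(), key=lambda item: item[1], reverse=True)


ranking = rank_callers(reported_accuracy)
for position, (caller, acc) in enumerate(ranking, start=1):
    print(f"{position}. {caller}: {acc:.2f}%")

# Gap between the most and least accurate caller, in percentage points.
spread = ranking[0][1] - ranking[-1][1]
print(f"Spread between best and worst caller: {spread:.2f} percentage points")
```

The ranking reflects average accuracy only; the abstract separately notes that the number of SVs detected differs by caller and by sequencing platform.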
Pages: 10