A High-Performance Database Management System for Managing and Analyzing Large-Scale SNP Data in Plant Genotyping and Breeding Applications

Cited by: 2
Authors
Zhao, Yikun [1]
Jiang, Bin [1]
Huo, Yongxue [1]
Yi, Hongmei [1]
Tian, Hongli [1]
Wu, Haotian [1]
Wang, Rui [1]
Zhao, Jiuran [1]
Wang, Fengge [1]
Affiliations
[1] Beijing Acad Agr & Forest Sci BAAFS, Maize Res Ctr, Beijing Key Lab Maize DNA Fingerprinting & Mol Br, Beijing 100097, Peoples R China
Source
AGRICULTURE-BASEL | 2021 / Vol. 11 / Issue 11
Funding
National Key Research and Development Program of China;
Keywords
SNP; SNP array; KASP; database; DNA fingerprint; algorithms; genotyping; DNA; DIVERSITY; SEQUENCE; BARCODE;
DOI
10.3390/agriculture11111027
Chinese Library Classification number
S3 [Agricultural science (agronomy)];
Discipline classification code
0901;
Abstract
A DNA fingerprint database is an efficient, stable, and automated tool for plant molecular research that can provide comprehensive technical support for multiple fields of study, such as pan-genome analysis and crop breeding. However, constructing a DNA fingerprint database for plants requires significant resources for data output, storage, analysis, and quality control. Large amounts of heterogeneous data must be processed efficiently and accurately. Thus, we developed the plant SNP database management system (PSNPdms) using an open-source web server and free software. It is compatible with single nucleotide polymorphism (SNP) and insertion-deletion (InDel) markers, with the Kompetitive Allele Specific PCR (KASP) and SNP array platforms, and with 23 species. It fully integrates with the KASP platform and allows for graphical presentation and modification of KASP data. The system has a simple, efficient, and versatile laboratory personnel management structure that adapts to complex and changing experimental needs with a simple workflow process. PSNPdms supports data quality control along multiple dimensions, such as standardized experimental design, standard reference samples, a fingerprint statistical selection algorithm, and raw data correlation queries. In addition, we developed a fingerprint-merging algorithm to solve the problem of merging fingerprints of mixed samples and single samples in plant detection, providing unique standard fingerprints of each plant species for construction of a standard DNA fingerprint database. Different laboratories can use the system to generate fingerprint packages for data interaction and sharing. We also integrated genetic analysis into the system to enable drawing and downloading of dendrograms. PSNPdms has been widely used by 23 institutions and has proven to be a stable and effective system for sharing data and performing genetic analysis. Interested researchers are invited to adapt and further develop the system.
Pages: 21