Applications of Machine Learning for the Classification of Porcine Reproductive and Respiratory Syndrome Virus Sublineages Using Amino Acid Scores of ORF5 Gene

Cited by: 5
Authors
Kim, Jeonghoon [1]
Lee, Kyuyoung [2]
Rupasinghe, Ruwini [2]
Rezaei, Shahbaz [3]
Martinez-Lopez, Beatriz [2]
Liu, Xin [3]
Affiliations
[1] Univ Calif Davis, Dept Math, Davis, CA 95616 USA
[2] Univ Calif Davis, Sch Vet Med, Ctr Anim Dis Modeling & Surveillance CADMS, Dept Med & Epidemiol, Davis, CA 95616 USA
[3] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Funding
National Science Foundation (USA);
Keywords
artificial intelligence; random forest; k-nearest neighbor; support vector machine; swine health; phylogenetic tree; multilayer perceptron; classification; IDENTIFICATION; DETERMINANTS; PATTERN;
DOI
10.3389/fvets.2021.683134
Chinese Library Classification (CLC) number
S85 [Veterinary medicine];
Discipline classification code
0906;
Abstract
Porcine reproductive and respiratory syndrome (PRRS) is an infectious disease of pigs caused by PRRS virus (PRRSV). A modified live-attenuated vaccine has been widely used to control the spread of PRRSV, and classification of field strains is key to successful control and prevention. Restriction fragment length polymorphism (RFLP) typing targeting the open reading frame 5 (ORF5) gene is widely used to classify PRRSV strains but has shown unstable accuracy. Phylogenetic analysis is a powerful tool for PRRSV classification with consistent accuracy, but it demands large computational power as the number of sequences increases. Our study aimed to apply four machine learning (ML) algorithms (random forest, k-nearest neighbor, support vector machine, and multilayer perceptron) to classify field PRRSV strains into four clades using amino acid scores based on the ORF5 gene sequence. We used ORF5 amino acid sequences from 1,931 field PRRSV strains collected in the US from 2012 to 2020. Phylogenetic analysis was used to label each field strain as one of four clades: Lineage 5 or one of three clades in Lineage 1. We measured the accuracy and time consumption of classification for the four ML approaches across different sizes of gene sequences. All four ML algorithms classified a large number of field strains in a very short time (<2.5 s) with very high accuracy (area under the receiver operating characteristic curve >0.99). Furthermore, the random forest approach detected a total of four key amino acid positions for classifying field PRRSV strains into the four clades. Our findings provide insight for developing rapid and accurate classification models using genetic information, which would also enable handling of large genome datasets in real time or semi-real time for data-driven decision-making and more timely surveillance.
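To make the classification task in the abstract concrete, the following is a minimal sketch of one of the four algorithm families it names, k-nearest neighbor, applied to aligned amino acid sequences. It is not the authors' implementation: the abstract does not define how the amino acid scores are computed, so this sketch substitutes a plain Hamming distance between sequences. The sequence fragments, the clade labels `L1A` and `L5`, and the function names `hamming` and `knn_predict` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Count aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training sequences nearest to query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy fragments with made-up clade labels; NOT real ORF5 data.
TRAIN = [
    ("MLGKCLTAGC", "L1A"),
    ("MLGKCLTAGY", "L1A"),
    ("MLGRCLTAGC", "L1A"),
    ("MLEKCLTVGC", "L5"),
    ("MLEKCLTVGY", "L5"),
    ("MLEKSLTVGC", "L5"),
]

print(knn_predict(TRAIN, "MLGKCLTAGS"))  # nearest neighbors are all labeled L1A
print(knn_predict(TRAIN, "MLEKCLTVGS"))  # nearest neighbors are all labeled L5
```

In a setting like the one described, the training labels would come from the phylogenetic analysis, and the distance function would be replaced by whatever numeric encoding the study actually uses.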
Pages: 10