Applications of Machine Learning for the Classification of Porcine Reproductive and Respiratory Syndrome Virus Sublineages Using Amino Acid Scores of ORF5 Gene

Cited by: 5
Authors
Kim, Jeonghoon [1]
Lee, Kyuyoung [2]
Rupasinghe, Ruwini [2]
Rezaei, Shahbaz [3]
Martinez-Lopez, Beatriz [2]
Liu, Xin [3]
Affiliations
[1] Univ Calif Davis, Dept Math, Davis, CA 95616 USA
[2] Univ Calif Davis, Sch Vet Med, Ctr Anim Dis Modeling & Surveillance CADMS, Dept Med & Epidemiol, Davis, CA 95616 USA
[3] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Funding
National Science Foundation (USA);
Keywords
artificial intelligence; random forest; k-nearest neighbor; support vector machine; swine health; phylogenetic tree; multilayer perceptron; classification; IDENTIFICATION; DETERMINANTS; PATTERN;
DOI
10.3389/fvets.2021.683134
Chinese Library Classification number
S85 [Animal medicine (veterinary medicine)];
Discipline classification code
0906;
Abstract
Porcine reproductive and respiratory syndrome (PRRS) is an infectious disease of pigs caused by PRRS virus (PRRSV). A modified live-attenuated vaccine has been widely used to control the spread of PRRSV, and the classification of field strains is key to successful control and prevention. Restriction fragment length polymorphism targeting the open reading frame 5 (ORF5) gene is widely used to classify PRRSV strains but has shown unstable accuracy. Phylogenetic analysis is a powerful tool for PRRSV classification with consistent accuracy, but it demands large computational power as the number of sequences increases. Our study aimed to apply four machine learning (ML) algorithms (random forest, k-nearest neighbor, support vector machine, and multilayer perceptron) to classify field PRRSV strains into four clades using amino acid scores based on the ORF5 gene sequence. We used amino acid sequences of the ORF5 gene from 1931 field PRRSV strains collected in the US from 2012 to 2020. Phylogenetic analysis was used to label each field PRRSV strain as one of four clades: Lineage 5 or one of three clades in Lineage 1. We measured the accuracy and time consumption of classification for the four ML approaches with different sizes of gene sequences. We found that all four ML algorithms classified a large number of field strains in a very short time (<2.5 s) with very high accuracy (area under the receiver operating characteristic curve >0.99). Furthermore, the random forest approach detected a total of four key amino acid positions for the classification of field PRRSV strains into four clades. Our findings provide insight for developing a rapid and accurate classification model using genetic information, which would also enable handling large genome datasets in real time or semi-real time for data-driven decision-making and more timely surveillance.
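To make the idea of classifying strains from amino acid scores concrete, the following is a minimal sketch of one of the four algorithms named in the abstract, k-nearest neighbor, written in plain Python. It is not the authors' pipeline. The scoring scheme (each residue mapped to its index in a 20-letter alphabet), the clade labels "L5" and "L1A", and the six-residue training fragments are all hypothetical and made up for illustration; the record does not say how the study's amino acid scores were computed.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical scoring scheme (an assumption, not the paper's):
# each residue is scored by its index in the alphabet above.
SCORE = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Turn an aligned amino acid sequence into a list of numeric scores."""
    return [SCORE[aa] for aa in seq]


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training sequences."""
    q = encode(query)
    # Squared Euclidean distance between score vectors, smallest first.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(encode(seq), q)), label)
        for seq, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up aligned fragments with hypothetical clade labels;
# these are not real ORF5 sequences.
TRAIN = [
    ("MLGKCL", "L5"),
    ("MLGKCV", "L5"),
    ("MLGRCL", "L5"),
    ("MVEKSL", "L1A"),
    ("MVEKSV", "L1A"),
    ("MVDKSL", "L1A"),
]

print(knn_predict(TRAIN, "MLGKCI"))
print(knn_predict(TRAIN, "MVEKSI"))
```

In a setup like the one the abstract describes, the training labels would come from the phylogenetic analysis, and each classifier would be evaluated on held-out strains.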
Pages: 10