Machine Learning Approaches for Cyanobacteria Bloom Prediction using metagenomic sequence data, a case study

Authors
Huang, JianDong [1]
Zheng, Huiru [1]
Wang, Haying [1]
Jiang, Xingpeng [2]
Affiliations
[1] Ulster Univ Jordanstown, Sch Comp, Jordanstown, Antrim, North Ireland
[2] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China
Keywords
Machine learning; Cyanobacteria blooms; OTU (Operational Taxonomic Unit); Support vector machines; Microcystins; River
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Cyanobacteria blooms are a serious public health threat and a global challenge. The literature on bloom prediction and forecasting has been growing, with most emphasis on the relationship between blooms and environmental factors; however, the complexity of the bloom mechanism makes it difficult for such models to achieve adequate performance. The rapid development of next-generation sequencing techniques allows a comprehensive and quick examination of the microbial community, especially the structure of the bloom community. This makes it possible to predict and forecast bloom occurrence from sequence data alone using machine learning techniques, yet few reports on this theme exist in the literature. In this case study, machine learning approaches were applied with metagenomic data as the only input (rather than environmental data) to predict Cyanobacteria blooms. k-NN classification, SVM classification, and k-means clustering were applied, and their performance was evaluated using relevant indices. Feature selection was performed, and the resulting sub-datasets were processed one after another. In the prediction experiment with the k-NN approach, the final year's data from the 8-year OTU time series were used as the target data, and various combinations of the preceding years' data were used as predictor data. The best result among the experiments was obtained with all 7 preceding years as predictor input: an F1 score of 1.00 and 100% for sensitivity, specificity, precision, and accuracy. This case study demonstrated the feasibility of predicting Cyanobacteria blooms with machine learning approaches using only metagenomic sequence data, and the importance of feature selection for obtaining better results from these approaches. Machine learning approaches based on metagenomic data are efficient, economical, and fast, giving them potential as a promising means of bloom prediction in practice.
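The k-NN experiment described in the abstract (train on preceding years, test on the final year, score with sensitivity, specificity, precision, accuracy, and F1) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the OTU abundances are synthetic, the variance-based feature selector is a stand-in for whatever selection method the authors used, and the names `synthetic_year`, `top_variance_features`, and `run_experiment` are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training samples."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision, accuracy and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    sens = ratio(tp, tp + fn)
    prec = ratio(tp, tp + fp)
    return {
        "sensitivity": sens,
        "specificity": ratio(tn, tn + fp),
        "precision": prec,
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * prec * sens, prec + sens),
    }


def top_variance_features(X, m):
    """Indices of the m highest-variance columns (assumed stand-in selector)."""
    n = len(X)

    def variance(j):
        col = [row[j] for row in X]
        mean = sum(col) / n
        return sum((v - mean) ** 2 for v in col) / n

    return sorted(range(len(X[0])), key=variance, reverse=True)[:m]


def synthetic_year(rng, n_samples=12, n_otus=6):
    """Made-up OTU abundances; OTU 0 is elevated in bloom (label 1) samples."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.random() for _ in range(n_otus)]
        row[0] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


def run_experiment(n_train_years=7, k=3, m=3, seed=0):
    """Train on the n years preceding the final year, test on the final year."""
    rng = random.Random(seed)
    years = [synthetic_year(rng) for _ in range(8)]
    train = years[7 - n_train_years:7]
    train_X = [row for X, _ in train for row in X]
    train_y = [label for _, y in train for label in y]
    test_X, test_y = years[7]

    # Select features on the training years only, then project both sets.
    keep = top_variance_features(train_X, m)
    train_sel = [[row[j] for j in keep] for row in train_X]
    test_sel = [[row[j] for j in keep] for row in test_X]

    preds = [knn_predict(train_sel, train_y, x, k) for x in test_sel]
    return binary_metrics(test_y, preds)


if __name__ == "__main__":
    for n in (1, 3, 7):
        print(n, run_experiment(n_train_years=n))
```

Selecting features on the training years only keeps the final year unseen until evaluation, which matches the predictor/target split described above. Scores on this synthetic data say nothing about the paper's reported results.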
Pages: 2054-2061
Number of pages: 8