Predicting Lactobacillus delbrueckii subsp. bulgaricus-Streptococcus thermophilus interactions based on a highly accurate semi-supervised learning method

Cited by: 0
Authors
Shujuan Yang [1, 2, 3, 4]
Mei Bai [1, 2, 3, 4]
Weichi Liu [5, 6]
Weicheng Li [1, 2, 3, 4]
Zhi Zhong [1, 2, 3, 4]
LaiYu Kwok [1, 2, 3, 4]
Gaifang Dong [5, 6]
Zhihong Sun [1, 2, 3, 4]
Affiliations
[1] Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University
[2] Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University
[3] Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University
[4] Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University
[5] College of Computer and Information Engineering, Inner Mongolia Agricultural University
[6] Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal
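Keywords
DOI
Not available
Chinese Library Classification
TS201.3 [Food microbiology]; TP18 [Artificial intelligence theory];
Discipline classification codes
082203; 081104; 0812; 0835; 1405;
Abstract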
Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) and Streptococcus thermophilus (S. thermophilus) are commonly used starters in milk fermentation. Fermentation experiments revealed that L. bulgaricus-S. thermophilus interactions (LbStI) substantially impact dairy product quality and production. Because traditional wet-lab experiments are time-consuming and labor-intensive for screening interacting combinations, an artificial intelligence-based method for screening interactive starter combinations is necessary. However, in current bioinformatics research on artificial intelligence-based interaction prediction, most successful models adopt supervised learning methods, and there is a lack of research on interaction prediction with only a small number of labeled samples. Hence, this study aimed to develop a semi-supervised learning framework for predicting LbStI using genomic data from 362 isolates (181 per species). The framework consisted of a two-part model: a co-clustering prediction model (based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) dataset) and a Laplacian regularized least squares prediction model (based on K-mer analysis and gene composition datasets of all isolates). To enhance accuracy, we integrated the separate outcomes produced by each component of the two-part model to generate the final LbStI prediction results. Validation through milk fermentation experiments confirmed a high precision rate of 85% (17/20; validated with 20 randomly selected combinations of expected interacting isolates). Our data suggest that the biosynthetic pathways of cysteine, riboflavin, teichoic acid, and exopolysaccharides, as well as the ATP-binding cassette transport systems, contribute to the mutualistic relationship between these starter bacteria during milk fermentation. However, this finding requires further experimental verification.
The presented model and data are valuable resources for academics and industry professionals interested in screening dairy starter cultures and understanding their interactions.
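For readers unfamiliar with the second component named in the abstract, the following is a minimal sketch of Laplacian regularized least squares (LapRLS) in its standard kernel form, together with a simple k-mer frequency profile. It is not the authors' implementation: the function names (`kmer_profile`, `rbf_kernel`, `lap_rls`), the RBF kernel, the regularization weights, and the toy feature vectors are all assumptions made for illustration, and the abstract does not state how isolate pairs were encoded or which LapRLS variant was used.

```python
import itertools
import numpy as np


def kmer_profile(seq, k=3):
    """Normalized k-mer frequency vector over the alphabet ACGT (illustrative)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing ambiguous bases such as N
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def rbf_kernel(X, gamma=1.0):
    """Gaussian similarity between all rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def lap_rls(K, y, labeled, gamma_a=1e-2, gamma_i=1e-2):
    """Standard kernel LapRLS: returns a score for every sample.

    K       : (n, n) kernel matrix over labeled and unlabeled samples
    y       : (n,) labels (+1 / -1); entries for unlabeled samples are ignored
    labeled : (n,) boolean mask marking the labeled samples
    """
    n = K.shape[0]
    n_labeled = int(labeled.sum())
    if n_labeled == 0:
        raise ValueError("at least one labeled sample is required")
    J = np.diag(labeled.astype(float))      # selects the labeled rows
    W = K.copy()
    np.fill_diagonal(W, 0.0)                # similarity graph without self-loops
    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian
    Y = np.where(labeled, y, 0.0)
    A = J @ K + gamma_a * n_labeled * np.eye(n) \
        + (gamma_i * n_labeled / n ** 2) * (L @ K)
    alpha = np.linalg.solve(A, Y)
    return K @ alpha


# Toy data: six hypothetical isolate-pair feature vectors in two tight
# clusters, with one labeled pair per cluster (+1 interacting, -1 not).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
labeled = np.array([True, False, False, True, False, False])
y = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])
scores = lap_rls(rbf_kernel(X), y, labeled)
```

In this toy setting the unlabeled pairs in each cluster receive scores with the same sign as the nearby labeled pair, which is the behavior that makes the method usable when only a small number of labeled samples are available. In practice, each row of `X` would be derived from genomic features such as the profiles returned by `kmer_profile`; how two isolates' features are combined into one pair vector is left open here.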
Pages: 558-574
Number of pages: 17