In the big data era, data-driven analysis that mines the internal regularities of data has become a growing trend in sewage management and decision-making. This paper explored regression algorithms in machine learning (ML), including linear models, decision trees, ensemble learning, nearest-neighbor models, and support vector machines, to predict the effluent water quality of a membrane bioreactor (MBR) treating high-salinity ammonia nitrogen wastewater. Using conventional online monitoring data as input features, the concentrations of NH4+-N-out, NO3--N-out, NO2--N-out, CODout, and TNout were the prediction targets. Each model was optimized using parameter-tuning learning curves and a grid-search strategy. The importance of each input feature to each prediction target was analyzed. Finally, to make better use of the water quality data that wastewater enterprises accumulate over the long term, and to reduce management difficulty and operating cost, this paper proposed a dataset contribution degree (DCD) analysis method that combines ML with evaluation indicators. The results showed that the ensemble algorithms performed better both in the training process and in prediction accuracy. Combining the datasets improved the models' prediction performance, and accumulating raw data collected under salinity impact improved the predictability of saline wastewater quality.
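The grid-search tuning mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this study: the data are synthetic stand-ins for online monitoring features, the nearest-neighbor regressor is written from scratch, and the names `knn_predict`, `grid_search`, and the grid values are hypothetical choices made only for illustration. Each hyperparameter combination is scored by root-mean-square error (RMSE) on a hold-out set, and the combination with the lowest error is kept.

```python
import math
import random


def knn_predict(train_X, train_y, x, k, weighted):
    """Predict one value as the (optionally distance-weighted) mean of the k nearest neighbors."""
    nearest = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))[:k]
    if not weighted:
        return sum(y for _, y in nearest) / len(nearest)
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def grid_search(train_X, train_y, val_X, val_y, grid):
    """Score every combination in the grid on the validation set and return the best one."""
    results = []
    for k in grid["k"]:
        for weighted in grid["weighted"]:
            preds = [knn_predict(train_X, train_y, x, k, weighted) for x in val_X]
            results.append(({"k": k, "weighted": weighted}, rmse(val_y, preds)))
    best_params, best_score = min(results, key=lambda item: item[1])
    return best_params, best_score, results


# Synthetic data (assumption): two features in [0, 1] and a noisy nonlinear target.
random.seed(0)
X = [[random.uniform(0, 1), random.uniform(0, 1)] for _ in range(120)]
y = [3.0 * a + 2.0 * b ** 2 + random.gauss(0, 0.05) for a, b in X]

# Simple hold-out split: first 90 samples for fitting, last 30 for validation.
train_X, val_X = X[:90], X[90:]
train_y, val_y = y[:90], y[90:]

# Hypothetical hyperparameter grid: neighbor count and weighting scheme.
grid = {"k": [1, 3, 5, 15], "weighted": [False, True]}
best_params, best_score, results = grid_search(train_X, train_y, val_X, val_y, grid)
print("best parameters:", best_params, "validation RMSE:", round(best_score, 4))
```

Plotting the validation RMSE in `results` against `k` gives a curve of model score versus a single hyperparameter, which is one way to read the parameter-tuning learning curve described above. In practice, cross-validation would usually replace the single hold-out split.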