Considering that group comparisons are common in social science, we examined two latent group mean testing methods when groups of interest were either at the between or within level of multilevel data: multiple-group multilevel confirmatory factor analysis (MG ML CFA) and multilevel multiple-indicators multiple-causes modeling (ML MIMIC). The performance of these methods were investigated through three Monte Carlo studies. In Studies 1 and 2, either factor variances or residual variances were manipulated to be heterogeneous between groups. In Study 3, which focused on within-level multiple-group analysis, six different model specifications were considered depending on how to model the intra-class group correlation (i.e., correlation between random effect factors for groups within cluster). The results of simulations generally supported the adequacy of MG ML CFA and ML MIMIC for multiple-group analysis with multilevel data. The two methods did not show any notable difference in the latent group mean testing across three studies. Finally, a demonstration with real data and guidelines in selecting an appropriate approach to multilevel multiple-group analysis are provided.