Antenna selection is a multiple-input multiple-output (MIMO) technique that uses radio-frequency (RF) switches to select a good subset of antennas. Because it reduces the number of RF transceivers required, it is attractive for massive MIMO systems, where the RF switching architecture must be considered carefully. In this paper, we examine two switching architectures: full-array and sub-array. Assuming independent and identically distributed Rayleigh flat-fading channels, we use asymptotic theory on order statistics to derive asymptotic upper bounds on the capacity of massive MIMO channels with antenna selection for both switching architectures in the large-scale limit. From these bounds, we further derive upper bounds on the ergodic achievable spectral efficiency that account for channel state information (CSI) acquisition. We also show that the ergodic capacity of the sub-array antenna selection system scales no faster than a double-logarithmic rate. In addition, we propose optimal antenna selection algorithms based on branch-and-bound for both switching architectures. Our results show that the derived asymptotic bounds are effective and also apply to finite-dimensional MIMO systems, and that CSI acquisition is one of the main limiting factors for massive MIMO antenna selection systems in time-variant channels. The proposed optimal antenna selection algorithms are much faster than exhaustive-search-based antenna selection, e.g., a 1000x speedup is observed in a large-scale system. Interestingly, the full-array and sub-array systems perform very similarly, as confirmed by both their exact capacities and their closely matching upper bounds on capacity.
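
The difference between the two switching architectures can be made concrete with a small exhaustive-search baseline. The following is a minimal sketch, not the branch-and-bound algorithm proposed in the paper. It assumes transmit-side selection of `n_rf` columns of an i.i.d. Rayleigh channel matrix `H`, an equal power split over the selected antennas, and a sub-array layout in which the antennas are partitioned into `n_rf` contiguous groups with exactly one antenna chosen per group. The function names, the SNR value, and the dimensions are illustrative assumptions.

```python
import itertools
import numpy as np


def capacity(H, cols, snr):
    """Capacity in bit/s/Hz for the selected columns of H.

    Assumed model: log2 det(I + (snr / L) * H_S H_S^H), i.e. equal power
    split over the L selected transmit antennas.
    """
    Hs = H[:, list(cols)]
    L = Hs.shape[1]
    G = np.eye(H.shape[0]) + (snr / L) * (Hs @ Hs.conj().T)
    _, logdet = np.linalg.slogdet(G)  # natural log of |det G|
    return logdet / np.log(2)


def full_array_candidates(n_tx, n_rf):
    """Full-array switching: any RF chain can reach any antenna."""
    return itertools.combinations(range(n_tx), n_rf)


def sub_array_candidates(n_tx, n_rf):
    """Sub-array switching (assumed layout): one antenna per contiguous group."""
    size = n_tx // n_rf
    groups = [range(k * size, (k + 1) * size) for k in range(n_rf)]
    return itertools.product(*groups)


def best_selection(H, candidates, snr):
    """Exhaustive search: return (capacity, antenna indices) of the best subset."""
    return max(((capacity(H, c, snr), c) for c in candidates),
               key=lambda pair: pair[0])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_rx, n_tx, n_rf, snr = 2, 8, 4, 10.0  # illustrative values only
    # i.i.d. Rayleigh flat fading: unit-variance complex Gaussian entries
    H = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

    c_full, s_full = best_selection(H, full_array_candidates(n_tx, n_rf), snr)
    c_sub, s_sub = best_selection(H, sub_array_candidates(n_tx, n_rf), snr)
    print(f"full-array: {c_full:.3f} bit/s/Hz with antennas {s_full}")
    print(f"sub-array:  {c_sub:.3f} bit/s/Hz with antennas {s_sub}")
```

In this sketch every sub-array selection is also a valid full-array selection, so the full-array capacity can never be lower for a given channel realization. With 8 antennas and 4 RF chains, the search space shrinks from 70 candidate subsets (full-array) to 16 (sub-array). Exhaustive search is only practical at such small sizes.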