There is a broad literature on multiple-choice test development, covering both item-writing guidelines and the psychometric functioning of the format as a measurement tool. However, most of the published literature concerns multiple-choice testing in the context of expert-designed, high-stakes standardized assessments, with little attention paid to the use of the technique in non-expert, instructor-created classroom examinations. In this work, we present a quantitative analysis of a large corpus of multiple-choice tests deployed in the classrooms of a primarily undergraduate university in Canada. Our report aims to establish three related things. First, by reporting on the functional and psychometric performance of 182 multiple-choice tests deployed in a variety of courses across all undergraduate levels, we establish a much-needed baseline for classroom tests as actually deployed. Second, we motivate and present modified statistical measures, such as item-excluded correlation measures of discrimination and length-normalized measures of reliability, that should serve as useful parameters for future comparisons of classroom test psychometrics. Finally, we use the broad empirical data from our survey of tests to update widely used item-quality guidelines.
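
To illustrate the kind of measures referred to above (a minimal sketch, not the paper's exact implementation), the following Python example computes a corrected item-total ("item-excluded") discrimination index and a Cronbach's alpha projected to a common test length via the Spearman-Brown formula; the function names and the illustrative target length of 50 items are assumptions introduced here for the example.

```python
import numpy as np

def item_excluded_discrimination(scores):
    """Corrected item-total correlation for each item.

    scores: (n_students, n_items) array of 0/1 item scores.
    Each item is correlated with the total score computed
    *without* that item, avoiding the inflation that occurs
    when an item is correlated with a total that includes it.
    """
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    totals = scores.sum(axis=1)
    disc = np.empty(n_items)
    for j in range(n_items):
        rest = totals - scores[:, j]   # total score excluding item j
        disc[j] = np.corrcoef(scores[:, j], rest)[0, 1]
    return disc

def cronbach_alpha(scores):
    """Cronbach's alpha (internal-consistency reliability)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var_sum = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

def length_normalized_alpha(scores, target_length=50):
    """Project alpha to a common test length with the Spearman-Brown
    formula, so tests of different lengths can be compared directly.
    The target length of 50 items is an illustrative assumption."""
    k = np.asarray(scores).shape[1]
    alpha = cronbach_alpha(scores)
    n = target_length / k              # lengthening (or shortening) factor
    return n * alpha / (1 + (n - 1) * alpha)
```

The item-excluded correlation avoids the upward bias of a raw item-total correlation (where each item partly correlates with itself), and the length normalization removes the mechanical advantage longer tests have on raw reliability coefficients, which is what makes tests of different lengths comparable.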