Detecting low-diameter clusters is an important graph-based data mining technique used in social network analysis, bioinformatics and text-mining. Low pairwise distances within a cluster can facilitate fast communication or good reachability between vertices in the cluster. Formally, a subset of vertices that induce a subgraph of diameter at most k is called a k-club. For low values of the parameter k, this model offers a graph-theoretic relaxation of the clique model that formalizes the notion of a low-diameter cluster. Using a combination of graph decomposition and model decomposition techniques, we demonstrate how the fundamental optimization problem of finding a maximum size k-club can be solved optimally on large-scale benchmark instances that are available in the public domain. Our approach circumvents the use of complicated formulations of the maximum k-club problem in favor of a simple relaxation based on necessary conditions, combined with canonical hypercube cuts introduced by Balas and Jeroslow.