Distinguishing useful microseismic signals is a critical step in microseismic monitoring. Here, we present the time series contrastive clustering (TSCC) method, an end-to-end unsupervised model for clustering microseismic signals that combines a contrastive learning network with a centroid-based clustering model. The TSCC framework consists of two successive phases: pretraining and fine-tuning. In the pretraining phase, two random cropping augmentations transform the microseismic time series into two distinct but correlated views. Multiscale temporal and instance contrastive learning is then used to discriminate between positive and negative views, which encourages the encoder to capture the contextual information of microseismic signals from multiple perspectives and to generate discriminative representations from unlabeled data. In the fine-tuning phase, the encoder weights are iteratively updated by performing contrastive learning and clustering simultaneously. The corresponding loss is a weighted combination of the contrastive and clustering loss functions, which induces the encoder to learn representations that improve clustering performance. Test results demonstrate that the proposed method achieves higher clustering accuracy (ACC) than popular clustering methods, including $k$-means, deep embedding clustering (DEC), unsupervised clustering with deep convolutional autoencoders (DCAs), and deep clustering with self-supervision (DCSS). Moreover, the TSCC model produces results comparable to those of supervised deep learning approaches while requiring no labeled data, manual feature extraction, or large training datasets. In practice, the TSCC model achieves a clustering ACC of 98.07% and a normalized mutual information (NMI) of 86.26%.
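Because cluster indices are arbitrary, a clustering ACC such as the one reported above is usually computed under the best one-to-one mapping between predicted clusters and reference classes. The following is a minimal sketch of that conventional definition, which we assume applies here; the function name `clustering_accuracy` and the brute-force search over mappings are illustrative choices, not the evaluation code used in this work.

```python
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Fraction of samples matched under the best one-to-one cluster-to-class mapping.

    Illustrative helper (hypothetical name). It tries every mapping, so it is
    only practical for a small number of clusters.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    classes = list(dict.fromkeys(y_true))   # unique class labels, order preserved
    clusters = list(dict.fromkeys(y_pred))  # unique cluster indices, order preserved

    # If there are more clusters than classes, pad with a sentinel that never
    # equals a real label, so the surplus clusters simply stay unmatched.
    n_pad = max(0, len(clusters) - len(classes))
    padded = classes + [object()] * n_pad

    best = 0
    for perm in permutations(padded, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(1 for t, p in zip(y_true, y_pred) if mapping[p] == t)
        best = max(best, hits)
    return best / len(y_true)


if __name__ == "__main__":
    # Same partition with permuted cluster indices: every sample is matched.
    print(clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2]))
```

For more than a handful of clusters, the exhaustive search is typically replaced by the Hungarian algorithm (for example, `scipy.optimize.linear_sum_assignment` applied to the negated cluster-by-class contingency table), which finds the same optimal mapping in polynomial time.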