In practical industrial applications, fault data for rolling bearings are difficult to collect, so diagnosis must rely on scarce data. This scarcity undermines the accuracy, robustness, and generalization of diagnosis in complex scenarios, and traditional methods perform poorly when data are limited and operating conditions are complex. To address these challenges, a prior knowledge embedding contrastive attention learning network (PKECALN) is proposed. PKECALN integrates feature extraction, prior knowledge (PK) embedding, and fault classification into a unified framework based on contrastive learning (CL). A 1-D deep convolutional neural network (1D-DCNN) combined with a custom-designed sequential attention module (SAM) extracts multiscale time-frequency fault features, while CL mitigates the problem of data scarcity. The PK embedding mechanism makes the model both data-driven and knowledge-driven: it enables the model to focus on critical feature frequency information and guides the learning of fundamental characteristics of fault signals, thereby enhancing the accuracy of bearing fault diagnosis. A composite loss function tailored to this network combines contrastive loss, cross-entropy loss, and mean squared error (MSE). Two case studies validate the feasibility and effectiveness of PKECALN in complex application scenarios, such as small sample sizes and variable speeds. One of these case studies also includes ablation experiments and an interpretability analysis.
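
As an illustration of how the three loss terms named above might be combined, one plausible form is a weighted sum. This is a sketch under stated assumptions, not the formulation confirmed by the abstract: the additive form, the weights \lambda_1, \lambda_2, \lambda_3, and the symbols N, C, y_{ic}, p_{ic}, \hat{u}_i, and u_i are all introduced here for illustration. The abstract does not say which quantities the MSE term compares, so \hat{u}_i and u_i stand for a generic prediction and its target, and the contrastive term \mathcal{L}_{\mathrm{CL}} is left unspecified.

```latex
% Assumed weighted-sum form; the weights \lambda_k are hypothetical hyperparameters.
\mathcal{L}
  = \lambda_{1}\,\mathcal{L}_{\mathrm{CL}}
  + \lambda_{2}\,\mathcal{L}_{\mathrm{CE}}
  + \lambda_{3}\,\mathcal{L}_{\mathrm{MSE}},
\qquad \lambda_{1},\lambda_{2},\lambda_{3} \ge 0

% Standard cross-entropy over N samples and C fault classes,
% with one-hot labels y_{ic} and predicted class probabilities p_{ic}.
\mathcal{L}_{\mathrm{CE}}
  = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\,\log p_{ic}

% Standard mean squared error between a prediction \hat{u}_i and its target u_i
% (what these represent in PKECALN is not stated in the abstract).
\mathcal{L}_{\mathrm{MSE}}
  = \frac{1}{N}\sum_{i=1}^{N}\bigl\lVert \hat{u}_{i} - u_{i} \bigr\rVert_{2}^{2}
```

Under this assumed form, setting one weight to zero removes the corresponding term, which is one common way to organize ablation experiments on a composite loss.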