The rich spatial and spectral information in hyperspectral images (HSIs) makes spectral-spatial relationships essential for HSI classification (HSIC). Recent work indicates that convolutional neural networks (CNNs) excel in HSIC but often struggle to extract spectral features precisely. Moreover, the abundance of spectral information makes it difficult to represent features efficiently and to minimize cross-domain interference. To address these limitations, we propose an efficient sequential spectral-spatial feature convolution network (S3FCN), which employs successive subnetworks for spectral and spatial feature extraction with depthwise separable convolution. This design preserves deep spectral and spatial features while significantly reducing the number of network parameters, improving both performance and computational efficiency. We also introduce a sequential spectral-spatial attention module (S3AM) to integrate cross-domain correlations. This module uses spectral features from the preceding subnetwork and multilevel residual layers to explore spatial features in depth, enabling deep integration for improved classification performance. The effectiveness of the proposed architecture is verified on five benchmark HSI datasets: Pavia University, Salinas Valley, Kennedy Space Center, Indian Pines, and Houston 2013. Experimental results demonstrate that the sequential spectral-spatial connection in feature extraction and the attention mechanism, combined with depthwise separable convolution, together surpass current state-of-the-art (SOTA) techniques in classification accuracy, with overall accuracies of 98.28%, 97.63%, 99.31%, 96.72%, and 95.38% across the five datasets, while limiting computational overhead and maintaining balanced network efficiency.
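
To make the parameter-reduction claim concrete, the following minimal sketch compares the weight count of a standard 2-D convolution with that of a depthwise separable convolution (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The channel counts and kernel size are hypothetical values chosen for illustration and are not taken from S3FCN; bias terms are omitted.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k 2-D convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only (not the S3FCN configuration).
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(sep / std, 3))  # 73728 8768 0.119
```

Under these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer; in general the ratio is 1/c_out + 1/k^2, so the saving grows with the number of output channels and the kernel size.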