Technology integration in modern education has transformed traditional teaching-learning methods, but maintaining student attentiveness during computer-aided activities remains challenging. Advances in neuroimaging provide valuable insights into cognitive processes. This study measures cognitive load during computer-aided education using functional near-infrared spectroscopy (fNIRS) brain signals collected while subjects perform mental tasks and rest. Three datasets are used to evaluate the performance of the proposed model. The first two datasets are open-access, and we prepared the third by collecting fNIRS brain signals from 14 healthy subjects. Two feature extraction techniques are proposed: manual feature extraction and automatic feature extraction based on the wavelet scattering transform (WST). A one-dimensional convolutional neural network (1D CNN) is also proposed, which extracts features automatically and performs classification. For comparison, four machine learning classifiers are considered: linear discriminant analysis (LDA), Naive Bayes (NB), k-nearest neighbors (KNN) and support vector machine (SVM). Classification performance is evaluated using accuracy, precision, recall and F1-score across all datasets. Computational cost, i.e., the CPU time and memory utilization for extracting the features and testing the classifiers, is also evaluated. The results suggest that, on average across the three datasets, the 1D CNN outperforms the four classifiers trained on manual or WST-based features in terms of classification accuracy (1.16 times higher), precision (1.10 times higher), recall (1.10 times higher) and F1-score (1.09 times higher). However, the CPU time and memory utilization of the 1D CNN are significantly higher, by factors of 10.09 and 14.70, respectively. Compared with four state-of-the-art deep learning models, the proposed 1D CNN also achieves the best classification accuracy (92.99%).
The analysis of the results shows that, for identifying cognitive load, an SVM with a Gaussian kernel function applied to WST-based features provides satisfactory classification performance with significantly less CPU time and memory utilization.
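For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how accuracy, precision, recall and F1-score can be computed for a two-class problem. The function name `binary_metrics`, the label coding (1 for mental task, 0 for rest) and the toy labels are assumptions made for illustration only; they are not the study's data or code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; label coding is an assumption
    (1 = mental task, 0 = rest).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for this example (not taken from any dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A statement such as "1.16 times higher" then corresponds to the ratio of one model's metric to another's, for example `metrics_a["accuracy"] / metrics_b["accuracy"]` for two hypothetical result dictionaries.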