The effectiveness of the proposed model is assessed for early epileptic seizure prediction, and its performance is compared with baseline models and several recently published works. The experiments are implemented in the Python programming language and run on a 64-bit Ubuntu operating system with a 3 GHz Intel processor and 32 GB of memory.
5.1. Datasets
The experiments employ three EEG benchmark datasets, namely the CHB-MIT scalp EEG dataset, the Bonn EEG dataset, and the New Delhi EEG dataset, to design the FL model for the proposed epileptic seizure prediction. The datasets are adapted to the seizure prediction task, which requires accurately discriminating the preictal class from either the interictal or the ictal class. The main aim of this work is seizure prediction through identification of the preictal state; hence, the evaluation focuses on accurately detecting the preictal state in EEG signals that combine the preictal state with one other state (interictal or ictal). Because only the preictal and ictal classes are available in the benchmark CHB-MIT dataset, the experiments on this dataset discriminate between the preictal and ictal classes, and only the predicted probability of the preictal class is passed on to the postprocessing stage.
Preprocessed CHB-MIT scalp EEG database: The original recordings were gathered through a collaboration between Children's Hospital Boston and the Massachusetts Institute of Technology (CHB-MIT) from patients with epileptic seizures that could not be controlled with medication. This prediction model utilizes the Preprocessed CHB-MIT scalp EEG database [45], which contains separate Comma Separated Value (CSV) files of preictal and ictal data, for the performance evaluation. To fit the problem of epileptic seizure prediction, patients with adequate preictal and ictal samples are selected. Because only the preictal and ictal classes are available in the preprocessed CHB-MIT dataset, this work discriminates the preictal state from the ictal state when evaluating on this dataset.
Bonn EEG dataset: The University of Bonn provides the Bonn EEG dataset [46], which comprises five distinct subsets. Each subset contains 100 single-channel EEG epochs, digitized at a sampling rate of 173.61 Hz with 12-bit A/D resolution. Each EEG epoch contains 4097 samples and lasts 23.6 seconds. Here, single-channel means that each epoch holds the observations recorded from a single electrode; each subset therefore consists of 100 such single-channel recordings. Following the research works [5, 47], sets C and D of the Bonn EEG dataset are treated as EEG samples of the interictal and preictal states.
New Delhi EEG dataset: The Neurology and Sleep Center (NSC) database [48] consists of EEG segments of 1024 samples each, sampled at 200 Hz and lasting 5.12 seconds. Of the three publicly available states in the NSC dataset (ictal, preictal, and interictal), this work considers the preictal and interictal classes for evaluating the seizure prediction algorithm.
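The segment durations quoted for the Bonn and NSC datasets follow directly from the sample counts and sampling rates. The short Python sketch below checks this arithmetic; the function name `segment_duration` is illustrative and not part of any dataset toolkit.

```python
def segment_duration(n_samples: int, sampling_rate_hz: float) -> float:
    """Return the duration in seconds of a segment with n_samples samples."""
    return n_samples / sampling_rate_hz


# Sample counts and sampling rates as stated in the dataset descriptions.
bonn_duration = segment_duration(4097, 173.61)
nsc_duration = segment_duration(1024, 200.0)

print(f"Bonn: {bonn_duration:.1f} s, NSC: {nsc_duration:.2f} s")
```

Both values agree with the stated durations of 23.6 seconds and 5.12 seconds after rounding.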
Moreover, experiments are conducted on patients' clinical records, comprising demographic data and HRV features derived from ECG signals, to evaluate the ANFIS-based decision-making in the postprocessing stage of the proposed system. Because the benchmark EEG datasets tested in this work contain no ECG data, several HRV features are modeled with reference to the works [39, 40] instead of being extracted from actual ECG signals. Preprocessing and examining ECG signals is outside the scope of this work. Hence, to demonstrate the influence of the ECG features on epilepsy decision-making, HRV features were synthesized within their standard ranges. Furthermore, the three epileptic EEG datasets do not include clinical information about each patient. Thus, to test the influence of clinical information on epileptic seizure prediction, patient-specific clinical information is randomly modeled for each dataset.
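As a rough illustration of how HRV features and clinical records could be synthesized from value ranges, consider the Python sketch below. Everything in it is an assumption made for illustration: the feature names, the numeric ranges, the demographic fields, and the uniform sampling are placeholders, not the ranges or the procedure taken from [39, 40].

```python
import random

# Placeholder ranges for illustration only; NOT the ranges used in [39, 40].
HRV_RANGES = {
    "mean_rr_ms": (600.0, 1000.0),
    "sdnn_ms": (20.0, 100.0),
    "rmssd_ms": (15.0, 60.0),
}


def synthesize_hrv(ranges, rng):
    """Draw one value per HRV feature uniformly from its (low, high) range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


def synthesize_clinical(rng):
    """Randomly model hypothetical demographic fields for one patient."""
    return {"age_years": rng.randint(1, 80), "sex": rng.choice(["F", "M"])}


rng = random.Random(0)  # fixed seed so the synthetic records are reproducible
patients = [
    {"hrv": synthesize_hrv(HRV_RANGES, rng), "clinical": synthesize_clinical(rng)}
    for _ in range(5)
]
```

A fixed seed keeps the synthetic records identical across runs, which matters when such records feed a later evaluation stage.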
5.2. Performance Metrics
The experiments use the following evaluation metrics to demonstrate the reliability of the proposed model: sensitivity, specificity, accuracy, False Positive Rate (FPR), and Area Under the Curve (AUC).
Sensitivity: It is the ratio between the number of correctly classified preictal samples and the total number of preictal samples. Sensitivity is also known as recall.
Specificity: It is the ratio between the number of correctly classified interictal samples and the total number of interictal samples.
Accuracy: It measures the overall performance of the model in detecting both the preictal and interictal samples.
False Positive Rate: It measures the number of false positives over the total test period.
Area Under the Curve (AUC): AUC quantifies how well the learning model discriminates between positive and negative samples; a higher AUC score indicates better performance.
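The count-based metrics defined above can be computed from the entries of a confusion matrix, as in the following minimal Python sketch. It assumes the preictal class is the positive class and interprets FPR as false positives per hour of test data, which is one reading of "over the total test period"; the function name, the `test_hours` parameter, and the example counts are hypothetical.

```python
def prediction_metrics(tp, fn, tn, fp, test_hours):
    """Compute sensitivity, specificity, accuracy, and FPR.

    tp/fn count preictal samples (positive class); tn/fp count samples of the
    other class. FPR is taken here as false positives per hour (an assumption).
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "fpr_per_hour": fp / test_hours,
    }


# Made-up counts for illustration; not results from the paper.
example = prediction_metrics(tp=90, fn=10, tn=95, fp=5, test_hours=100.0)
print(example)
```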
5.3. Results
This experimental study compares the proposed method with several baseline models and Existing Epileptic Seizure Prediction (EESP) works. The baseline models are K-Nearest Neighbor (KNN), Decision Tree, Support Vector Machine (SVM), CNN, and LSTM, whereas the EESP works are EESP1 [24], EESP2 [25], and EESP3 [27]. In this experiment, the baseline algorithms were evaluated as classification models on the samples of the three benchmark datasets. This section provides the results for the discrimination of the preictal state from the interictal state as well as from the ictal state: the results on the Bonn and NSC datasets are obtained on preictal-interictal samples, and the results on the CHB-MIT dataset are obtained on preictal-ictal samples.
Epileptic seizure prediction has been realized with different anticipation strategies. Fixing the prediction time and taking the seizure onset time as the norm is therefore ineffective, because the seizure prediction time varies from one patient to another and from one period to another, even for the same patient. Hence, a seizure prediction algorithm must ultimately be tested and evaluated on real-time medical cases to prove its prediction performance. In this research work, the task is therefore evaluated as a classification problem, namely the discrimination of preictal samples from the other states, to assess the seizure prediction performance.
Table 5 compares the epileptic seizure prediction performance of the proposed method with the existing EESP1, EESP2, and EESP3 models. The evaluated metrics indicate the performance of discrimination between the preictal and interictal classes, which reflects the seizure prediction performance. The proposed method outperformed the models in Table 5, and its prediction performance was evaluated in a setting similar to the real-time scenario by using the Leave-One-Out Cross Validation (LOOCV) method during training. The comparative baseline models and EESP works used k-fold cross-validation and train-test splits to evaluate the CHB-MIT, Bonn, and NSC EEG datasets.
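For readers unfamiliar with LOOCV, the following Python sketch shows how the splits are formed: each unit is held out once for testing while all remaining units are used for training. Whether the held-out unit is a sample or a patient is not specified here, so the identifiers below are hypothetical and the helper `leave_one_out_splits` is an illustrative name.

```python
def leave_one_out_splits(items):
    """Yield (train_items, held_out_item) pairs, holding out each item once."""
    for i, held_out in enumerate(items):
        train = items[:i] + items[i + 1:]
        yield train, held_out


units = ["P1", "P2", "P3", "P4"]  # hypothetical identifiers
splits = list(leave_one_out_splits(units))

for train, held_out in splits:
    print(f"train on {train}, test on {held_out}")
```

With N units this produces N train/test rounds, in contrast to k-fold cross-validation, which produces k rounds with larger test folds.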
Figure 9 illustrates the comparative sensitivity and specificity of the proposed seizure prediction method and the existing EESP1, EESP2, and EESP3 works on both the CHB-MIT and Bonn EEG datasets. On the CHB-MIT dataset, the baseline KNN, Decision Tree, and SVM classifiers detected the preictal state with a sensitivity of 54.16%, 82.19%, and 89.39%, respectively. With the same number of patients and samples, the proposed method achieved 11.46% and 6.8% higher accuracy than the deep learning models CNN and LSTM, respectively. Compared with EESP1 on the CHB-MIT dataset, the proposed method obtained a 10.54% higher sensitivity and a 0.094 lower false positive rate. The sensitivity and specificity of the proposed method were higher than those of the other models and works when tested on the three EEG datasets.
As reported in Table 5, the proposed approach reaches an average sensitivity of 93.94% and an FPR of 0.044 on the Bonn EEG dataset. The overall sensitivity, specificity, and accuracy on the NSC dataset reach 91.11%, 94.24%, and 92.72%, respectively. Thus, the proposed model accurately categorizes the preictal and interictal states in the Bonn and NSC datasets and the preictal and ictal states in the CHB-MIT dataset. Table 5 shows that the proposed method obtains the best results among the compared methods. Even though EESP3 achieves a true negative rate of 95.23%, which is higher than that of the proposed method, the proposed prediction model outperforms the existing works in accuracy by improving the true positive rate. All methods were evaluated on the three publicly available EEG benchmark datasets, CHB-MIT, Bonn, and NSC, both locally and globally in the FL setting. Deciding which model predicts epileptic seizures better is difficult, because each method is tested on limited data from different patients in different datasets. Hence, the generalizability of the proposed method is also tested without patient-specific clinical data and ECG data, that is, using the proposed method without the ANFIS-PSO model (SE, GCNN, and FL only).
Furthermore, Table 6 shows that the epileptic seizure prediction model combining SE, GCNN, FL, and ANFIS-PSO provides a sensitivity of 96.33% and a specificity of 96.14% on the CHB-MIT dataset. The sensitivity and specificity are 93.94% and 96.13% on the Bonn dataset and 91.11% and 94.24% on the NSC dataset, respectively. It is therefore concluded that the preictal state is recognized accurately, whether it is discriminated from the ictal state in the CHB-MIT dataset or from the interictal state in the Bonn and NSC datasets. However, the performance measures vary marginally across the EEG datasets, even though the proposed method uses the global model parameters for the local model updates, owing to variations in time definitions, patients, and epileptic patterns. From the performance of the baseline models and existing works presented in Table 6, it is apparent that the proposed method yields better epileptic seizure prediction results on the three datasets.
Compared with accuracy and specificity, the performance in detecting the preictal class is extremely important in this research, so sensitivity is a significant measure for validating the epileptic seizure prediction method. Table 6 provides the comparative performance of the proposed method in the centralized and federated approaches. In this research, training in the centralized approach means learning from one EEG dataset at a time, followed by gradient computation and weight updates. The federated approach, on the other hand, processes the three EEG datasets at once and then averages the weights of all the clients.
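The weight-averaging step of the federated approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each client's model is flattened to a plain list of weights, the mean is unweighted (weighting by client sample count is a common alternative that is not assumed here), and the weight values are made up.

```python
def average_client_weights(client_weights):
    """Return the element-wise mean of the weight vectors of all clients."""
    n_clients = len(client_weights)
    return [sum(values) / n_clients for values in zip(*client_weights)]


# One hypothetical weight vector per client (one client per EEG dataset).
client_weights = {
    "CHB-MIT": [0.2, 0.4, 0.6],
    "Bonn": [0.4, 0.6, 0.8],
    "NSC": [0.6, 0.8, 1.0],
}

global_weights = average_client_weights(list(client_weights.values()))

# Each client continues its local training from a copy of the averaged weights.
local_models = {name: list(global_weights) for name in client_weights}
```

In the centralized setting described above, by contrast, no such averaging occurs because a single model is updated from one dataset at a time.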
By introducing FL in combination with the SE-GCNN model, this work found improvements of 1.56% in sensitivity and 0.51% in specificity on the CHB-MIT dataset. This enhancement is due to the adoption of the global model by the FL model and the segment-aware training sample generation in the proposed system, which facilitate the discrimination between the preictal and interictal states. As shown in Table 6, the centralized approach has the worst sensitivity, specificity, accuracy, and false positive rate among the proposed models. Consequently, the learning process in the epileptic seizure prediction system adopts the FL model. Moreover, the proposed system uses the ANFIS model in the postprocessing stage, and its results are influenced by the SE, GCNN, and FL models. As a result, the performance of the proposed model increases to 96.33% sensitivity, 96.14% specificity, and 96.28% accuracy on the CHB-MIT dataset, as reported in Table 5. The false positive rate is also reduced to 0.032. During postprocessing, the ANFIS-PSO model is tested with different combinations of input data: i) EEG and ECG, ii) EEG and demographic, and iii) EEG, ECG, and demographic. The combination of EEG, ECG, and demographic data outperforms the other two cases, achieving 91.11% sensitivity and 94.24% specificity on the NSC dataset.
Moreover, Table 7 provides the performance of the proposed method and EESP2 for the individual patients in the CHB-MIT dataset. Among the patients in the CHB-MIT dataset, a few achieve comparatively better results; for example, patient CHB03 achieves the highest performance, with 98.27% sensitivity, 96.09% accuracy, and 0.071 FPR. In a seizure prediction system, improving all the metrics (sensitivity, specificity, FPR, and accuracy) is important. According to Table 7, the proposed method reaches an average specificity of 91.24% across the different patients. The proposed epileptic seizure prediction with the SE-GCNN and FL model yields a minimal false positive rate owing to the spiking sequence-based graph construction and the influence of the generalized pattern-based local model updates. Moreover, the HRV features of the seizure activity, combined with the EEG-based prediction probability, greatly facilitate the achievement of a higher sensitivity of 89.84% across all the patients. The performance presented in Table 7 illustrates that the proposed method ensures stability and maintains the trade-off between the accuracy over all the patients and the accuracy for a single patient.
ROC curves with the AUC scores of the proposed model on the three benchmark datasets are plotted in Figure 10. Figure 10 shows that the proposed model can discriminate the preictal samples from both the interictal and ictal samples across the CHB-MIT, Bonn, and NSC datasets. The three-tier architecture designed for implementing the FL model in the epileptic seizure prediction system, which updates the local models under the influence of the global model parameters, greatly assists in achieving AUC scores of 0.896, 0.932, and 0.923 on the different EEG datasets. Overall, the ROC-AUC analysis portrays the significance of the proposed method in supporting accurate real-time prediction for generalized seizure patients through FL-assisted coarse-grained personalization and ANFIS-assisted fine-grained personalization.
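To make the AUC score concrete, the Python sketch below computes it through its rank interpretation: the probability that a randomly chosen positive (preictal) sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. This is equivalent to the area under the ROC curve; the function name `roc_auc` and the labels and scores are made up for illustration and are not outputs of the proposed model.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


labels = [1, 1, 1, 0, 0, 0]              # 1 = preictal, 0 = other state
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # hypothetical predicted probabilities
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9.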
Figure 11 illustrates the ROC curves with the AUC scores for each of the five tested patients of the CHB-MIT dataset. From Figure 10 and Figure 11, it is determined that the proposed epilepsy prediction method achieves high AUC scores both for seizure prediction across all the patients and for patient-specific seizure prediction. Among the five test epileptic patients, the proposed approach accurately predicted the epileptic seizures of patient CHB03, with an AUC score of 0.961, by discriminating the preictal state from the ictal samples.