Epilepsy is a brain disorder that severely impairs patients’ daily lives. It is characterized by recurrent, sudden epileptic seizures [1]. The World Health Organization estimates that more than 50 million people are affected by epilepsy [2, 3], and approximately 1 % of the population has this neurological disorder [4–6]. This has motivated numerous research works on identifying epilepsy and on related treatments. Electroencephalogram (EEG) signals have proved to be a powerful tool for detecting and diagnosing different neurological diseases, and they are often used to detect and classify epilepsy [7]. However, it is often difficult for experts to recognize people who have a brain disorder through visual inspection of EEG signals [8]. In addition, visual inspection for discriminating EEG signals is time consuming, error prone and costly, and it does not provide sufficiently reliable information. The analysis and classification of EEG signals can lead to better diagnostic techniques for brain-related disorders. It is thus important to develop better EEG classification methods.
Many researchers have developed techniques to extract significant information from EEG signals, which is then used as the input to different classifiers. Many approaches exist for extracting the key features and for further selecting among them. Most of these fall under five broad categories: time domain, frequency domain, time–frequency domain, traditional non-linear methods and graph theory approaches [9].
One of the methods used in this paper for extracting features from epileptic EEG data is the simple random sampling (SRS) technique, which researchers have often applied in the time domain. In this technique, each sample of the population has the same chance of being selected as a subject. The complete sampling process is done in a single step, with each subject selected independently of the other samples of the population [10]. We then forward all these samples to the sequential feature selection (SFS) method to select the best features.
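The two steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the toy segment and the toy scoring function are hypothetical, sampling is assumed to be without replacement, and the selection step is assumed to be a greedy forward search that adds one feature at a time.

```python
import random

def simple_random_sample(signal, n, seed=None):
    """Draw n data points without replacement; every point is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(signal), n)

def sequential_forward_selection(n_features, score, k):
    """Greedy forward SFS: start from an empty set and repeatedly add the
    feature whose inclusion gives the highest score(subset), until k are chosen."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage (all values below are illustrative assumptions).
segment = [float(i % 7) for i in range(200)]   # stand-in for one EEG segment
sample = simple_random_sample(segment, 20, seed=0)

weights = [0.1, 0.9, 0.5]                      # hypothetical usefulness of 3 features

def toy_score(subset):
    """Hypothetical criterion; in practice this could be a classifier's accuracy."""
    return sum(weights[f] for f in subset)

chosen = sequential_forward_selection(len(weights), toy_score, k=2)
```

In a real pipeline the scoring function would typically be the cross-validated accuracy of the classifier on the candidate feature subset, which is far more expensive than the toy criterion used here.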
This study uses the selected features as the input to a classifier. One of the most popular classifiers, the least squares support vector machine (LS_SVM) [11], is used to classify the EEG data, that is, to distinguish the EEG data of healthy people from those of epileptic patients.
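For readers unfamiliar with the classifier, the standard two-class LS-SVM formulation replaces the quadratic program of the ordinary SVM with a single linear system in the bias b and the multipliers α. The sketch below illustrates that textbook formulation only; the RBF kernel, the values of gamma and sigma, the helper names and the toy data are assumptions chosen for illustration and are not taken from this study.

```python
import math

def rbf(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b, alpha]^T = [0, 1, ..., 1]^T,
    where Omega_ij = y_i * y_j * K(x_i, x_j). Labels must be +1 or -1."""
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf(X[i], X[j], sigma)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]          # bias b, multipliers alpha

def lssvm_predict(x, X, y, b, alpha, sigma=1.0):
    """Sign of sum_i alpha_i * y_i * K(x, x_i) + b."""
    s = sum(a * yi * rbf(x, xi, sigma) for a, yi, xi in zip(alpha, y, X)) + b
    return 1 if s >= 0 else -1

# Toy usage: one feature, two well-separated classes (illustrative data only).
X_train = [[0.0], [0.2], [1.0], [1.2]]
y_train = [-1, -1, 1, 1]
b, alpha = lssvm_train(X_train, y_train)
pred_low = lssvm_predict([0.1], X_train, y_train, b, alpha)
pred_high = lssvm_predict([1.1], X_train, y_train, b, alpha)
```

Because training reduces to one linear solve, every training point generally receives a non-zero multiplier, so the sparseness of the ordinary SVM solution is lost in exchange for a simpler optimization.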
Many approaches to EEG signal classification have been developed [12], and a diverse range of classification accuracies has been reported for epileptic EEG data. The previous research is briefly discussed below.
Gajic et al. [13] extracted different features from the time, frequency and time–frequency domains and from non-linear analysis. These features were obtained from sub-bands with good representative characteristics. The researchers reduced the dimension of the features by using scatter matrices. This method yielded 98.7 % accuracy.
An optimum allocation-based principal component analysis method was proposed by Siuly and Li [8] to extract key features for the classification of multi-class EEG signals from epileptic EEG data. They compared four classifiers, namely LS_SVM, the naive Bayes classifier, the k-nearest neighbour (KNN) algorithm and linear discriminant analysis, to find out which one performed best. They also used four output coding approaches for the multi-class LS_SVM: error correcting output codes, minimum output codes, one versus one (1vs1) and one versus all. That method achieved 100 % accuracy with LS_SVM_1vs1.
Martis et al. [14] carried out feature extraction through empirical mode decomposition. The extracted features were forwarded to two classifiers, the classification and regression tree and the C4.5 classifier. The method using the C4.5 classifier obtained 95.33, 98 and 97 % for accuracy, sensitivity and specificity, respectively.
Chua et al. [15] extracted features from raw EEG recordings by using higher order spectra (HOS). They used a Gaussian mixture model (GMM) classifier and an SVM classifier to detect epileptic EEG signals. They achieved average accuracies of 93.11 and 92.56 % for the HOS-based GMM classifier and the SVM classifier, respectively, for different EEG classes, such as normal, pre-ictal and epileptic EEGs.
On the other hand, a genetic algorithm (GA) was used by Guo et al. [16] to automatically extract features from EEG data in order to enhance the classifier’s performance as well as to reduce the dimensionality of the features. They used two groups of epileptic datasets. The first group consisted of two classes: healthy people and epileptic patients. The second group consisted of three classes: healthy people, inter-ictal and ictal. The KNN classifier was used to classify the two groups. For the first group, they obtained accuracies of 88.6 % without GA and 99.2 % with GA. For the second group, they obtained 67.2 % without GA and 93.5 % with GA.
Ocak decomposed EEG signals, which were recorded from normal subjects and epileptic patients, by using the discrete wavelet transform [17]. Approximate entropy (ApEn) values were extracted from the approximation and detail coefficients. The methodology achieved more than 96 % accuracy.
Srinivasan et al. used ApEn to extract features and an artificial neural network classifier to identify epileptic EEG signals [18]. That approach achieved an overall accuracy of 100 %.
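As background for the ApEn feature used in the studies above, the following is a minimal sketch of the usual definition: ApEn(m, r) is the difference between the average log-frequency of template matches of length m and of length m + 1, with self-matches counted. The parameter defaults and the test signals are assumptions for illustration. Here r is an absolute tolerance, whereas in practice it is commonly set relative to the standard deviation of the signal.

```python
import math
import random

def approximate_entropy(u, m=2, r=0.2):
    """ApEn(m, r) = phi(m) - phi(m + 1), where phi(k) is the mean log fraction
    of length-k templates lying within Chebyshev distance r of each template
    (self-matches are included, so the logarithm is always defined)."""
    n = len(u)

    def phi(k):
        templates = [u[i:i + k] for i in range(n - k + 1)]
        total = 0.0
        for a in templates:
            count = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(count / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)

# Toy usage: a perfectly regular signal versus uniform noise (illustrative only).
regular = [0.0, 1.0] * 50
rng = random.Random(0)
noisy = [rng.random() for _ in range(100)]

apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
```

The regular signal yields a value close to zero, while the irregular one yields a clearly larger value, which is the property that makes ApEn usable as a discriminating feature. The direct implementation above is quadratic in the signal length, so it is suited to short segments or sub-band coefficients rather than long raw recordings.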
Srinivasan et al. also proposed the use of a special type of recurrent neural network, the Elman network [19]. They used features extracted in the time and frequency domains as the input to the proposed classifier. The Elman network method yielded 99.6 % accuracy with a single input feature.
A wavelet transform method was used by Gajic et al. [20] to extract the key features. They also used scatter matrices to reduce the dimensionality of the features. These features were used as the input to a quadratic classifier. The epileptic EEG database was classified into healthy subjects, epileptic subjects during a seizure-free interval (inter-ictal) and epileptic subjects during seizure activity (ictal). They obtained 99 % classification accuracy.
Shen et al. [12] proposed a cascade of wavelet-ApEn for feature selection. They used Fisher scores for adaptive feature selection and an SVM for feature classification to detect epileptic seizures. They applied the method to two kinds of epileptic EEG recordings: open source EEG data and clinical EEG data. The method obtained overall classification accuracies of 99.97 and 98.73 %, respectively.
A sampling technique (ST) based on an LS_SVM was proposed by Siuly et al. [21]. First, they used the ST to extract features from two classes: normal persons with eyes open and epileptic patients during seizure activity. They then applied the LS_SVM to the extracted features. The total classification accuracies of that approach were 80.31 and 80.05 % for the training and testing datasets, respectively.
Husain and Rao [22] presented an artificial neural network model using the back propagation algorithm for the classification of epileptic EEG signals. They decomposed the EEG signals into a finite set of band-limited signals termed intrinsic mode functions. They also applied the Hilbert transform to these intrinsic mode functions to calculate instantaneous frequencies. They achieved 99.80 % overall classification accuracy.
Rückstieß et al. [23] used an SFS method to select the most representative features at each time step. Each successive feature depended on the previously selected features. All the features were put into one vector and forwarded to a classifier. This approach was applied to handwritten digit classification and a medical diabetes prediction task.
A sequential floating forward selection (SFFS) algorithm was proposed by Choi et al. [24] to detect epileptic seizures in EEG signals. They used the SFFS algorithm to select the strongest energy (power) values from the frequency bands as the features. The total accuracy obtained by that method was 97.2 %.
In this study, we develop a new method that combines the SRS with the SFS to acquire the best feature set, and we then use these features as the input to the LS_SVM classifier for EEG classification. All the techniques are discussed in Sects. 3 and 4. The conclusion is presented in Sect. 5.