EEG Features *

Time domain
• Mean value
• Standard deviation
• RMS amplitude
• Min value
• Max value
• Line length
• Slope
• Inactive samples
• Activity
• Mobility
• Complexity
• Minima
• Maxima
• Zero crossings of 1st derivative
• Zero crossings of 2nd derivative
• Kurtosis
• Skewness
• Nonlinear energy
• Zero crossings
• Petrosian Fractal Dimension
• Higuchi Fractal Dimension
• Detrended Fluctuation Analysis
• Hurst exponent

Frequency domain
• Wavelet energy
• Shannon Entropy
• Spectral Entropy
• Fisher Entropy
• SVD Entropy
• AR coefficient 1
• AR modelling error 1
• AR coefficient 2
• AR modelling error 2
• AR coefficient 4
• AR modelling error 4
• AR coefficient 8
• AR modelling error 8
• Peak frequency
• Total spectrum
• Delta (Power Spectral Intensity)
• Theta (Power Spectral Intensity)
• Alpha (Power Spectral Intensity)
• Beta (Power Spectral Intensity)
• Normalized Delta (Relative Intensity Ratio)
• Normalized Theta (Relative Intensity Ratio)
• Normalized Alpha (Relative Intensity Ratio)
• Normalized Beta (Relative Intensity Ratio)
• Mean power spectrum
• Intensity weighted mean frequency
• Intensity weighted bandwidth
• Median frequency
Time Domain Features
When a clinician or researcher examines an EEG recording, it is time domain features that are observed. These time domain features typically encompass the shape and morphology of the signal and can include poorly-defined descriptions of the waveform (such as spikiness, uniformity or degree of asymmetry).
RMS amplitude
The root mean square (RMS) amplitude, or quadratic mean, is a statistical measure of the magnitude of a time-varying quantity. For an epoch $x_j$ of $n_s$ samples it is defined as:

$$x_{\mathrm{RMS}}(j) = \sqrt{\frac{1}{n_s}\sum_{i=1}^{n_s} x_j[i]^2}$$
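As an illustration (not part of the original text), the RMS amplitude can be computed with NumPy; the function name is ours:

```python
import numpy as np

def rms_amplitude(x):
    """Root mean square (quadratic mean) amplitude of an epoch."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))
```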
Line length
Line length (L) is used as a measure of signal complexity; it was initially proposed by Esteller et al. [1] as an indicator of seizure onset. It is similar to the waveform fractal dimension, although it has been shown to be more computationally efficient. Line length (sometimes referred to as curve length) is defined for an epoch $x_j$ as:

$$L(j) = \sum_{i=2}^{n_s} \left| x_j[i] - x_j[i-1] \right|$$

Thus, line length is the running sum of the absolute distances between consecutive points within the sliding window of size $n_s$.
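The running sum of consecutive distances translates directly into NumPy (an illustrative sketch, not the author's code):

```python
import numpy as np

def line_length(x):
    """Sum of absolute differences between consecutive samples of an epoch."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.abs(np.diff(x))))
```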
Slope
The slope of the EEG signal describes its steepness and is calculated from the first derivative of the signal, approximated by the first difference $x'_j[i] = x_j[i+1] - x_j[i]$. The mean slope of each epoch is given by the cumulative sum of this difference over consecutive sample points, divided by the number of differences:

$$S(j) = \frac{1}{n_s - 1}\sum_{i=1}^{n_s - 1} \left( x_j[i+1] - x_j[i] \right)$$
Inactive samples
The number of inactive samples within an epoch is defined as the number of samples for which there is very little change in the EEG amplitude. It is calculated by applying a threshold of 0.01 to the absolute value of the derivative of the EEG signal and counting the samples that fall below it.
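Both features can be sketched from the first difference of the epoch (illustrative NumPy, with the 0.01 threshold from the text as the default):

```python
import numpy as np

def mean_slope(x):
    """Mean of the first difference (discrete first derivative) of the epoch."""
    return float(np.mean(np.diff(np.asarray(x, dtype=float))))

def inactive_samples(x, threshold=0.01):
    """Count samples whose absolute first difference falls below the threshold."""
    dx = np.abs(np.diff(np.asarray(x, dtype=float)))
    return int(np.sum(dx < threshold))
```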
Activity Mobility and Complexity
In probability theory and statistics, the variance of a signal is a measure of how far the values of a probability distribution lie from the mean of that distribution. Variance is often referred to as the second central moment and in EEG signal processing is sometimes denoted as activity, or the 1st Hjorth parameter [2]. The variance or activity of an epoch is thus given by:

$$\sigma_j^2 = \frac{1}{n_s}\sum_{i=1}^{n_s} \left( x_j[i] - \bar{x}_j \right)^2$$

where $\bar{x}_j$ is the sample mean of the epoch, defined as:

$$\bar{x}_j = \frac{1}{n_s}\sum_{i=1}^{n_s} x_j[i]$$

The square root of the variance is referred to as the standard deviation $\sigma_j$. Hjorth [2] also introduced two further EEG features, mobility and complexity, based on the standard deviations of the first and second derivatives of the EEG signal, respectively. The Hjorth mobility of an epoch is defined as:

$$M_j = \frac{\sigma'_j}{\sigma_j}$$

where $\sigma'_j$ is the standard deviation of the first derivative of the epoch. The complexity of an epoch is defined as:

$$C_j = \frac{\sigma''_j / \sigma'_j}{\sigma'_j / \sigma_j}$$

where $\sigma''_j$ is the standard deviation of the second derivative of the epoch.
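The three Hjorth parameters can be computed together from the epoch and its first two differences (an illustrative sketch):

```python
import numpy as np

def hjorth_parameters(x):
    """Return activity (variance), mobility and complexity of an epoch."""
    x = np.asarray(x, dtype=float)
    dx, ddx = np.diff(x), np.diff(x, n=2)
    activity = float(np.var(x))
    mobility = float(np.std(dx) / np.std(x))
    complexity = float((np.std(ddx) / np.std(dx)) / mobility)
    return activity, mobility, complexity
```

For a pure sinusoid the mobility approximates the angular frequency per sample and the complexity is close to 1.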
Kurtosis
Kurtosis, the fourth standardized moment, is a measure of the "peakedness" of a probability density function and is defined as follows:

$$K_j = \frac{\frac{1}{n_s}\sum_{i=1}^{n_s} \left( x_j[i] - \bar{x}_j \right)^4}{\sigma_j^4}$$

Skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. A negative skew indicates that the tail on the left side of the probability density function (pdf) is longer than that on the right, with the bulk of the values lying to the right of the mean. Conversely, a positive skew indicates that the tail on the right side is longer than that on the left, with the bulk of the values lying to the left of the mean. Zero skewness indicates that the values are relatively evenly distributed on either side of the mean, typically (but not necessarily) implying a symmetric distribution. Skewness, the third standardized moment, is defined as:

$$S_j = \frac{\frac{1}{n_s}\sum_{i=1}^{n_s} \left( x_j[i] - \bar{x}_j \right)^3}{\sigma_j^3}$$
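The standardized moments can be computed directly from their definitions (an illustrative sketch; no bias correction is applied):

```python
import numpy as np

def skewness(x):
    """Third standardized moment of an epoch."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return float(np.mean(z ** 3) / x.std() ** 3)

def kurtosis(x):
    """Fourth standardized moment of an epoch (equals 3 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return float(np.mean(z ** 4) / x.std() ** 4)
```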
Nonlinear energy
Non-linear energy (NLE) is a function of the amplitude of a signal and the rate of change of that amplitude, and is defined as:

$$NLE(j) = \frac{1}{n_s - 2}\sum_{i=2}^{n_s - 1} \left( x_j[i]^2 - x_j[i-1]\,x_j[i+1] \right)$$
NLE was introduced by D’Alessandro et al. [3] as a feature in an epileptic seizure prediction algorithm.
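A vectorised sketch of the mean nonlinear energy (illustrative; for a pure sinusoid $A\sin(\omega n)$ the operator is constant and equals $A^2 \sin^2 \omega$):

```python
import numpy as np

def nonlinear_energy(x):
    """Mean Teager-Kaiser nonlinear energy: x[i]^2 - x[i-1]*x[i+1]."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x[1:-1] ** 2 - x[:-2] * x[2:]))
```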
Zero crossings, Minima, Maxima, Zero crossings of 1st derivative, Zero crossings of 2nd derivative
The number of zero crossings (Zc) is the number of times within an epoch that the EEG signal crosses the x-axis. The number of zero crossings of the 1st derivative of the EEG corresponds to the number of local maxima and minima of the EEG, while the number of zero crossings of the 2nd derivative corresponds to the number of inflection points, i.e. the number of times that the 2nd derivative of the EEG signal crosses the x-axis within an epoch.
Petrosian Fractal Dimension
Petrosian Fractal Dimension (PFD). For a time series, the PFD is defined as:

$$PFD = \frac{\log_{10} N}{\log_{10} N + \log_{10} \frac{N}{N + 0.4\,N_\delta}}$$

where N is the series length and $N_\delta$ is the number of sign changes in the signal derivative [4]. PFD is a scalar feature.
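The definition translates into a few lines of NumPy (an illustrative sketch; a monotone series has no sign changes and yields PFD = 1):

```python
import numpy as np

def petrosian_fd(x):
    """Petrosian fractal dimension of a time series."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)
    n_delta = int(np.sum(d[1:] * d[:-1] < 0))   # sign changes in the derivative
    n = len(x)
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))
```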
Higuchi Fractal Dimension
Higuchi Fractal Dimension (HFD). Higuchi's algorithm [5] constructs k new series from the original series $x[1], x[2], \ldots, x[N]$:

$$x[m],\; x[m+k],\; x[m+2k],\; \ldots,\; x\!\left[ m + \left\lfloor \tfrac{N-m}{k} \right\rfloor k \right]$$

where m = 1, 2, . . . , k.

For each of the k time series so constructed, the length L(m, k) is computed by:

$$L(m,k) = \frac{\left( \sum_{i=1}^{\lfloor (N-m)/k \rfloor} \left| x[m+ik] - x[m+(i-1)k] \right| \right) (N-1)}{\left\lfloor \frac{N-m}{k} \right\rfloor k^2}$$

The average length for scale k is computed as $L(k) = \frac{1}{k}\sum_{m=1}^{k} L(m,k)$.

This procedure is repeated for each k from 1 to $k_{max}$, and a least-squares fit is then used to determine the slope of the line that best fits the curve of ln(L(k)) versus ln(1/k). The slope is the Higuchi Fractal Dimension. HFD is a scalar feature.
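A compact NumPy sketch of Higuchi's procedure (illustrative; $k_{max} = 8$ is an assumed default, not a value prescribed by the text):

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Higuchi fractal dimension: slope of ln L(k) versus ln(1/k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_lk, log_inv_k = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):                       # k shifted, subsampled series
            idx = np.arange(m, n, k)
            dist = np.sum(np.abs(np.diff(x[idx])))
            # Higuchi's normalisation for the unequal series lengths
            lengths.append(dist * (n - 1) / ((len(idx) - 1) * k) / k)
        log_lk.append(np.log(np.mean(lengths)))
        log_inv_k.append(np.log(1.0 / k))
    slope, _ = np.polyfit(log_inv_k, log_lk, 1)
    return float(slope)
```

A straight line yields HFD = 1, while white noise approaches 2.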
Detrended Fluctuation Analysis
Detrended Fluctuation Analysis (DFA) was proposed in [6].
The procedure to compute the DFA of a time series is as follows.
(1) First integrate the series x into a new series $y(k) = \sum_{i=1}^{k} \left( x(i) - \bar{x} \right)$, where $\bar{x}$ is the average of $x(1), \ldots, x(N)$.
(2) The integrated series is then sliced into boxes of equal length n. In each box of length n, a least-squares line is fit to the data, representing the trend in that box. The y coordinate of the straight line segments is denoted by $y_n(k)$.
(3) The root-mean-square fluctuation of the integrated series is calculated by:

$$F(n) = \sqrt{\frac{1}{N}\sum_{k=1}^{N} \left[ y(k) - y_n(k) \right]^2}$$

where the subtraction $y(k) - y_n(k)$ is called detrending.
(4) The scaling exponent is defined as the slope of the line relating log F(n) to log n.
DFA is a scalar feature.
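The four steps above can be sketched in NumPy (illustrative; the box sizes are assumed defaults, and white noise gives a scaling exponent near 0.5):

```python
import numpy as np

def dfa(x, box_sizes=(4, 8, 16, 32, 64)):
    """Detrended fluctuation analysis scaling exponent of a time series."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                          # (1) integrate
    flucts = []
    for n in box_sizes:
        f2 = []
        for b in range(len(y) // n):
            seg = y[b * n:(b + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)  # (2) local linear trend
            f2.append(np.mean((seg - trend) ** 2))        # (3) detrended variance
        flucts.append(np.sqrt(np.mean(f2)))
    slope, _ = np.polyfit(np.log(box_sizes), np.log(flucts), 1)
    return float(slope)                                   # (4) slope of log F(n) vs log n
```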
Hurst exponent
The Hurst exponent (HURST) [7] is also called Rescaled Range statistics (R/S). To calculate the Hurst exponent of a time series, the first step is to calculate the accumulated deviation from the mean of the series within a range T:

$$X(t,T) = \sum_{i=1}^{t} \left( x_i - \bar{x}_T \right), \quad t \in [1, T]$$

where $\bar{x}_T = \frac{1}{T}\sum_{i=1}^{T} x_i$.

Then, R(T)/S(T) is calculated as:

$$\frac{R(T)}{S(T)} = \frac{\max_t X(t,T) - \min_t X(t,T)}{\sqrt{\frac{1}{T}\sum_{t=1}^{T} \left( x_t - \bar{x}_T \right)^2}}$$

The Hurst exponent is obtained by calculating the slope of the line produced by ln(R(n)/S(n)) versus ln(n) for n ∈ [2, N]. The Hurst exponent is a scalar feature.
Frequency Domain Features
When a clinician or researcher examines an EEG trace, they explicitly (via a fast Fourier transform (FFT) computed in the visualization software) or implicitly (by observing the periodicity of events in the EEG signal) utilize information from the frequency domain. Consequently, features from the EEG's frequency domain are used to quantify changes in the spectrum of the EEG during the presence of artefact. The power spectral density (PSD) of an epoch is obtained using a 128-point FFT. The FFT gives an output of $n_s$ complex coefficients, which are converted to real values by taking the absolute value of the coefficients. The spectrum of an EEG epoch can thus be expressed in vector form as the frequency coefficients $X_j = \left[ X_j(f_1), X_j(f_2), \ldots \right]$, where $X_j(f_i)$ is the amplitude of a sinusoid of frequency $f_i$.
Peak frequency
Peak frequency is defined as the frequency corresponding to the largest amplitude in the power spectral density (PSD). It is the dominant frequency component in the EEG signal for that epoch and should characterize to some degree the underlying source signal.
Total spectrum
The total spectrum or total power refers to the sum of the power in all bins of the PSD between 0 and 12 Hz:

$$TS(j) = \sum_{i} P_j(i)$$

where $P_j(i)$ is the power in bin i of epoch j.
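Both the peak frequency and the total spectrum can be read off an FFT-based PSD (an illustrative sketch; the 12 Hz cut-off follows the text):

```python
import numpy as np

def spectral_features(x, fs, fmax=12.0):
    """Peak frequency and total power of an epoch from an FFT-based PSD."""
    x = np.asarray(x, dtype=float)
    psd = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = freqs <= fmax                       # restrict to the 0-12 Hz band
    peak_freq = float(freqs[band][np.argmax(psd[band])])
    total_power = float(np.sum(psd[band]))
    return peak_freq, total_power
```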
Power Spectral Intensity and Relative Intensity Ratio
For a time series $x[1], \ldots, x[N]$, denote its Fast Fourier Transform (FFT) result as $X_1, \ldots, X_N$. A continuous frequency band from $f_{low}$ to $f_{up}$ is sliced into K bins, which can be of equal width or not. The boundaries of the bins are specified by a vector $band = [f_1, f_2, \ldots, f_{K+1}]$, such that the lower and upper frequencies of the ith bin are $f_i$ and $f_{i+1}$, respectively. Commonly used unequal bins are the EEG/MEG rhythms δ (0.5–4 Hz), θ (4–7 Hz), α (8–12 Hz) and β (12–30 Hz). For these bins, we have band = [0.5, 4, 7, 12, 30].
The Power Spectral Intensity (PSI) [8] of the kth bin is evaluated as:

$$PSI_k = \sum_{i = \lfloor N (f_k / f_s) \rfloor}^{\lfloor N (f_{k+1} / f_s) \rfloor} |X_i|, \quad k = 1, 2, \ldots, K$$

where $f_s$ is the sampling rate and N is the series length.
The Relative Intensity Ratio (RIR) [8] is defined on top of PSI:

$$RIR_j = \frac{PSI_j}{\sum_{k=1}^{K} PSI_k}, \quad j = 1, 2, \ldots, K$$
PSI and RIR are both vector features.
Intensity weighted mean frequency
The intensity weighted mean frequency (IWMF) is the average frequency of the power spectrum, defined as:

$$IWMF(j) = \frac{\sum_{i} P_j(i)\, f_i}{\sum_{i} P_j(i)}$$

where i is the frequency bin number, $P_j(i)$ is the estimated spectral power in bin i and $f_i = \frac{i f_s}{2N}$, with $f_s$ being the sampling frequency and N the total number of frequency bins. The IWMF corresponds to the expected frequency value in an EEG epoch.
Intensity weighted bandwidth
The intensity weighted bandwidth (IWBW) measures the spread of the power spectrum about the IWMF and is defined as:

$$IWBW(j) = \sqrt{\frac{\sum_{i} P_j(i) \left( f_i - IWMF(j) \right)^2}{\sum_{i} P_j(i)}}$$
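Both intensity-weighted features are weighted first and second moments of the spectrum (an illustrative sketch taking a precomputed PSD and its frequency axis):

```python
import numpy as np

def iwmf_iwbw(psd, freqs):
    """Intensity weighted mean frequency and bandwidth of a power spectrum."""
    psd = np.asarray(psd, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    iwmf = np.sum(psd * freqs) / np.sum(psd)
    iwbw = np.sqrt(np.sum(psd * (freqs - iwmf) ** 2) / np.sum(psd))
    return float(iwmf), float(iwbw)
```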
Wavelet energy
Wavelet analysis is a method that relies on the introduction of an appropriate basis and a characterization of the signal by the distribution of amplitude in that basis. If the wavelet is required to form a proper orthogonal basis, it has the advantage that an arbitrary function can be uniquely decomposed and the decomposition can be inverted [9, 10, 11].
The wavelet is a smooth and quickly vanishing oscillating function with good localization in both frequency and time. A wavelet family is the set of elementary functions generated by dilations and translations of a unique admissible mother wavelet $\psi(t)$:

$$\psi_{a,b}(t) = |a|^{1/2}\,\psi\!\left( a t - b \right)$$

where a and b are the scale and translation parameters, respectively, and t is the time. As a increases, the wavelet becomes narrower. Thus, one has a unique analytic pattern and its replications at different scales and with variable time localization.
Since the family $\{\psi_{j,k}(t)\}$, obtained with the dyadic scales $a = 2^j$ and translations $b = k$, is an orthonormal basis for $L^2(\mathbb{R})$, the concept of energy is linked with the usual notions derived from Fourier theory. The wavelet coefficients are given by $C_j(k) = \langle S, \psi_{j,k} \rangle$, and the energy at each resolution level j = −1, …, −N will be the energy of the detail signal:

$$E_j = \sum_{k} \left| C_j(k) \right|^2$$

and the energy at each sampled time k will be:

$$E(k) = \sum_{j=-1}^{-N} \left| C_j(k) \right|^2$$

In consequence, the total energy can be obtained by:

$$E_{tot} = \sum_{j<0} \sum_{k} \left| C_j(k) \right|^2 = \sum_{j<0} E_j$$

Then the normalized values, which represent the relative wavelet energy:

$$p_j = \frac{E_j}{E_{tot}}$$

for the resolution levels j = −1, −2, …, −N, define by scales the probability distribution of the energy. Clearly, $\sum_j p_j = 1$, and the distribution $\{p_j\}$ can be considered as a time-scale density. This gives a suitable tool for detecting and characterizing specific phenomena in time and frequency planes.
AR coefficient and AR modelling error
An autoregressive (AR) model can be used for prediction in a correlated time series. A variable in a correlated time series can be predicted from previous observations in the series by:

$$x[n] = \sum_{k=1}^{p} a_k\, x[n-k] + e[n]$$

where $a_1, \ldots, a_p$ are the parameters of the AR model of order p and $e[n]$ is a zero-mean white noise term accounting for the error in each prediction step. The parameters of the AR model are estimated over the first half of the epoch. The AR model is fit to the data over the first half of the epoch using the Yule-Walker method [12] and the model is then used to perform one-step-ahead prediction on the second half of the epoch, giving predictions $\hat{x}[n]$. The percentage error between the predicted and actual values is then given by:

$$E_p(j) = \frac{\sqrt{\sum_{n} \left( x[n] - \hat{x}[n] \right)^2}}{\sqrt{\sum_{n} x[n]^2}} \times 100$$

A total of 8 features are generated using this approach, corresponding to models of orders p = 1, 2, 4 and 8: an AR coefficient feature and a modelling error feature for each order.
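The fit-then-predict scheme can be sketched with a plain Yule-Walker solve (illustrative NumPy only; the relative-error formula below is one plausible reading, since the source equation is not reproduced in the text):

```python
import numpy as np

def yule_walker(x, order):
    """Solve the Yule-Walker normal equations for the AR coefficients."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def ar_percentage_error(x, order):
    """Fit on the first half of the epoch, one-step-ahead predict the second half."""
    x = np.asarray(x, dtype=float)
    half = len(x) // 2
    a = yule_walker(x[:half], order)
    idx = np.arange(max(half, order), len(x))
    pred = np.array([np.dot(a, x[i - order:i][::-1]) for i in idx])
    resid = x[idx] - pred
    return float(100.0 * np.sqrt(np.sum(resid ** 2) / np.sum(x[idx] ** 2)))
```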
In information theory, entropy is a measure of the uncertainty in a random variable. Shannon [13] introduced the concept of entropy in the context of digital communication but it has since proved an effective tool in the prediction and characterization of other signals. Consequently, entropy as introduced by Shannon as well as other information measures are utilized here as features to characterize EEG of differing types.
Shannon Entropy
Shannon entropy is a measure in information theory for estimating the uncertainty of an outcome [13]. It is the average unpredictability in a random variable, which is equivalent to its information content. To calculate Shannon entropy, the signal must first be represented as a discrete distribution. This is performed here by approximating the probability mass function by a 16-bin histogram. The Shannon entropy of the epoch is thus defined as:

$$H(j) = -\sum_{i=1}^{16} p_i \log_2 p_i$$

where $p_i$ is the probability mass of bin i. If the entropy of an epoch is zero, the observer is certain of its future value. Higher values of entropy then indicate increased uncertainty.
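The histogram-based estimate described above is a few lines of NumPy (an illustrative sketch; the 16-bin default follows the text):

```python
import numpy as np

def shannon_entropy(x, bins=16):
    """Shannon entropy (bits) of a histogram approximation of the epoch's pmf."""
    counts, _ = np.histogram(np.asarray(x, dtype=float), bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # convention: 0 * log(0) = 0
    return float(-np.sum(p * np.log2(p)))
```

A constant signal has zero entropy; a signal spread evenly over all 16 bins has the maximum of 4 bits.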
Spectral Entropy
Where the Shannon entropy is used to quantify the order in the EEG signal itself, the spectral entropy is a measure of the order in the frequency spectrum of the EEG:

$$H_s(j) = -\sum_{i} \bar{P}_j(i) \log_2 \bar{P}_j(i)$$

where i is a frequency index and $\bar{P}_j(i)$ is the normalised power spectral density:

$$\bar{P}_j(i) = \frac{P_j(i)}{\sum_{i} P_j(i)}$$
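An illustrative sketch: a pure tone concentrates the normalised PSD into one bin (entropy near zero), while broadband noise spreads it over many bins:

```python
import numpy as np

def spectral_entropy(x):
    """Entropy (bits) of the normalised power spectral density of an epoch."""
    psd = np.abs(np.fft.rfft(np.asarray(x, dtype=float))) ** 2
    p = psd / psd.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))
```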
SVD Entropy
Singular Value Decomposition (SVD) can be used to measure the complexity of a signal and is often used to obtain information about quasi-periodic signals in noise. The SVD algorithm decomposes a matrix such that:

$$A = U S V^T$$

where A is the input matrix, U and V have orthogonal columns such that $U^T U = I$ and $V^T V = I$, with I being the identity matrix, and S is a diagonal matrix of singular values. The singular values in S refer to the most significant underlying components in the signal. The number of singular values varies with the complexity of the signal, with an increase in signal complexity leading to a larger number of significant singular values. The number of significant singular values $\zeta_1, \ldots, \zeta_{d_E}$ can be obtained using Rissanen's Minimum Description Length algorithm [14].
The SVD entropy is the entropy of the singular spectrum [14]. By performing the SVD of an epoch as described above, the singular values $\zeta_1, \ldots, \zeta_{d_E}$ can be found. The SVD entropy is thus:

$$H_{SVD}(j) = -\sum_{i=1}^{d_E} \bar{\zeta}_i \log_2 \bar{\zeta}_i$$

where $d_E$ is the singular dimension given by Rissanen's Minimum Description Length, and $\bar{\zeta}_i$ are the normalised singular values such that:

$$\bar{\zeta}_i = \frac{\zeta_i}{\sum_{k=1}^{d_E} \zeta_k}$$
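An illustrative sketch: here the input matrix A is built as a delay-embedding of the epoch, and a fixed embedding dimension is used in place of the MDL-selected dimension described above (an assumption made for simplicity):

```python
import numpy as np

def svd_entropy(x, dim=10, tau=1):
    """Entropy of the normalised singular spectrum of a delay-embedding matrix.

    dim and tau (embedding dimension and delay) are assumed parameters."""
    x = np.asarray(x, dtype=float)
    rows = len(x) - (dim - 1) * tau
    A = np.array([x[i:i + dim * tau:tau] for i in range(rows)])
    s = np.linalg.svd(A, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))
```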
SVD entropies should be lower for quasi-periodic signals such as EEG baseline oscillations due to movement.
Fisher Entropy
The Fisher information is calculated from the normalised singular values $\bar{\zeta}_i$ of the EEG to describe the shape of the singular spectrum:

$$F(j) = \sum_{i=1}^{d_E - 1} \frac{\left( \bar{\zeta}_{i+1} - \bar{\zeta}_i \right)^2}{\bar{\zeta}_i}$$
[1] R. Esteller, J. Echauz, T. Tcheng, B. Litt and B. Pless. Line length: an efficient feature for seizure onset detection. In Proceedings of the IEEE Engineering in Medicine and Biology Conference (EMBC), Istanbul, Turkey, pages 1707–1710, 2001.
[2] B. Hjorth. EEG analysis based on time domain properties. Electroencephalography and Clinical Neurophysiology, 29(3):306–310, 1970.
[3] M. D'Alessandro, R. Esteller, G. Vachtsevanos, A. Hinson, J. Echauz and B. Litt. Epileptic seizure prediction using hybrid feature selection over multiple intracranial EEG electrode contacts: a report of four patients. IEEE Transactions on Biomedical Engineering, 50(5):603–615, 2003.
[4] A. Petrosian. Kolmogorov complexity of finite sequences and recognition of different preictal EEG patterns. In Proceedings of the 8th IEEE Symposium on Computer-Based Medical Systems, pages 212–217, 1995.
[5] T. Higuchi. Approach to an irregular time series on the basis of the fractal theory. Physica D, 31(2):277–283, 1988.
[6] C.-K. Peng, S. Havlin, H. E. Stanley and A. L. Goldberger. Quantification of scaling exponents and crossover phenomena in nonstationary heartbeat time series. Chaos, 5(1):82–87, 1995.
[7] T. Balli and R. Palaniappan. A combined linear & nonlinear approach for classification of epileptic EEG signals. In Proceedings of the 4th International IEEE/EMBS Conference on Neural Engineering (NER '09), pages 714–717, 2009.
[8] R. Q. Quiroga, S. Blanco, O. A. Rosso, H. Garcia and A. Rabinowicz. Searching for hidden information with Gabor transform in generalized tonic-clonic seizures. Electroencephalography and Clinical Neurophysiology, 103:434–439, 1997.
[9] I. Daubechies. Ten Lectures on Wavelets. SIAM, Philadelphia, 1992.
[10] A. Aldroubi and M. Unser, editors. Wavelets in Medicine and Biology. CRC Press, Boca Raton, 1996.
[11] S. Mallat. A Wavelet Tour of Signal Processing, second edition. Academic Press, San Diego, 1999.
[12] S. M. Kay. Modern Spectral Estimation. Pearson Education India, 1988.
[13] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27(3):379–423, 1948.
[14] S. J. Roberts, W. Penny and I. Rezek. Temporal and spatial complexity measures for electroencephalogram based brain-computer interfacing. Medical and Biological Engineering and Computing, 37(1):93–98, 1999.

* F. S. Bao, X. Liu and C. Zhang. PyEEG: an open source Python module for EEG/MEG feature extraction. Computational Intelligence and Neuroscience, 2011 (Article ID 406391).
* S. H. O'Regan. Artefact detection and removal algorithms for EEG diagnostic systems. PhD thesis, University College Cork, 2013.