Mental state and emotion detection from musically stimulated EEG

This literature survey clarifies the different approaches used to study the impact of musical stimuli on the human brain using the EEG modality. It examines the field through several aspects of such studies: the experimental protocol, the EEG machine, the number of channels investigated, the features extracted, the categories of emotions, the brain areas, the brainwaves, the statistical tests, and the machine learning algorithms used for classification and validation of the developed models. This article comments on the particular strengths and weaknesses of these approaches. Finally, the review proposes a suitable method for studying the impact of musical stimuli on the brain and discusses the implications of such studies.


Introduction
The human brain is a spectacularly complex organ, and how it processes emotion remains poorly understood. Discovering how the brain processes emotion will have an impact not only on artificial emotional intelligence and human-computer interfaces but also on many clinical applications for diagnosing affective and neurological disorders. Several multidisciplinary, collaborative research efforts across the globe use different modalities of brain research to investigate how the brain processes emotion. There are many ways to evoke emotion; music is an excellent elicitor of emotion [1]. While listening to particular music, subjects show physiological responses such as shivering, a racing heart, goosebumps, laughter, a lump in the throat, sensual arousal and sweating [2]. Listening to music engages various mental processes, for example perception, multimodal integration, attention, memory recall, syntactic processing, the processing of meaningful information, action, emotion and social cognition [3]. Thus, music is a potent stimulus for evoking emotions and for investigating the processing functions of the human brain.
There are several modalities of brain research, categorised by how they measure the neuronal activity of the brain: direct imaging and indirect imaging. Direct imaging measures the electrical or magnetic signals generated by neuronal activity directly, e.g. EEG (electroencephalography) and MEG (magnetoencephalography), whereas indirect imaging, e.g. fMRI (functional magnetic resonance imaging) and PET (positron emission tomography), measures neuronal activity via the oxygen consumption of neurones. Indirect imaging has excellent spatial resolution (around 4 mm for PET and 2 mm for fMRI) but low temporal resolution (1-2 min for PET and 4-5 s for fMRI) [4], along with other disadvantages:
-The subject has to take a radionuclide dye
-Claustrophobic
-Noisy
-Mostly used for clinical research purposes
-Highly expensive machine cost ($800,000-2,000,000) and scanning cost ($800-1500) [4]
Direct imaging has reasonably good spatial resolution (around 10 mm for EEG) and excellent temporal resolution (1 ms for EEG), and has several advantages for conducting stimulus-based experiments [4].
The paper is written as a summary of reviews, an analysis of the surveyed aspects and a synthesis of the reviewed aspects, and is organised as follows: Sect. 2 covers the structural information of the brain, Sect. 3 describes literature selection and analysis, Sect. 4 gives a summary of the reviews, and Sects. 5, 6 and 7 present the discussion, the suggested approach and the conclusion, respectively.

Functional structure of the brain

Before understanding EEG signals, we need to understand the structure of the brain. The human brain is divided into three major parts: the cerebrum, the cerebellum and the brain stem. The cerebrum is subdivided into the frontal lobe, parietal lobe, temporal lobe, occipital lobe, insular lobe and limbic lobe (refer Fig. 1, adapted from [5]). Each part is associated with particular mental capacities; for example, the parietal lobe perceives pain and taste sensations and is associated with problem-solving activities.
The temporal lobe is concerned with hearing and memory. The occipital lobe mainly contains the regions used for vision-related tasks. The frontal lobe is principally associated with emotions, problem solving, speech and movement [6,7]. An adult human brain holds, on average, 100 billion neurons [8]. Neurons process and transmit information through electrical and chemical signals, generating neuronal oscillations called brainwaves or EEG signals. Table 1 shows the electrical and functional characteristics of these waves. The frequency range of EEG signals is 0.5-100 Hz, whereas the amplitude range is 10-100 μV [9]. The delta wave has the highest amplitude and lowest frequency, whereas gamma waves have the highest frequency and lowest amplitude. Across the reviews, the reported band boundaries vary by ± 0.5-1 Hz.
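The conventional band boundaries in Table 1 can be expressed as a small lookup helper. This is a minimal Python sketch; the exact band edges are an assumption (commonly cited values), since, as noted, they vary by ± 0.5-1 Hz across the reviews:

```python
# Conventional EEG band edges in Hz (assumed values; studies differ by
# roughly +/- 0.5-1 Hz at each boundary).
EEG_BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}

def band_of(freq_hz):
    """Return the name of the EEG band containing freq_hz, or None."""
    for name, (lo, hi) in EEG_BANDS.items():
        if lo <= freq_hz < hi:
            return name
    return None
```

For example, `band_of(10.0)` falls in the alpha band, while a 50 Hz mains component falls inside the gamma range and is normally removed with a notch filter rather than analysed.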

Literature selection and analysis
The keywords used to select articles were "EEG", "Music" and "Emotions", searched on repositories such as PubMed, IEEE Xplore and Science Direct using the Mendeley research tool. A library of twenty-two quality papers from the years 2001 to 2018 was created using Mendeley [10]. The most commonly followed research methodology is shown in Fig. 2. The articles were analysed with respect to the general steps observed in an experiment: participants, stimulus, EEG machine, channels, montages, preprocessing, feature extraction, statistical testing and machine learning.

Summary of reviews
This section summarises the findings and outcomes of all the selected articles.
For a musical stimulus known to vary in affective valence (positive versus negative) and intensity (intense versus calm), the author found a pattern of asymmetrical frontal EEG activity: greater relative left frontal EEG activity for joyful and happy musical excerpts, and greater relative right frontal EEG activity for fearful and sad musical excerpts. The author additionally found that EEG asymmetry distinguished the intensity of emotion [11]. For distinct stimulus excerpts of jazz, rock-pop, traditional music and environmental sound, the author found that positive emotional attributions were accompanied by an increase in left temporal activation, and negative ones by a more bilateral pattern with predominance of the right fronto-temporal cortex. The author additionally found that female participants showed greater valence-related differences than males [12]. In another study, pleasant and unpleasant feelings were evoked by consonant and dissonant musical excerpts; the author found that pleasant music was associated with an increase in frontal mid-line θ power [13]. In a further study, an EEG-based emotion classification algorithm was explored using four types of musical excerpt; the hemispheric asymmetry of α power indices of brain activation was extracted as a feature [14]. The author examined the connection between EEG signals and music-induced emotional responses using four emotional music excerpts (Oscar film tracks), finding that the low-frequency bands δ, θ and α are correlates of the evoked emotions [15]. In another study, the author investigated the spatial and spectral patterns of feelings evoked by musical excerpts, finding that the spatial and spectral patterns most significant for emotion were reproducible across subjects [16].
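The hemispheric α-power asymmetry feature noted for [14] is typically computed as a difference of log powers between homologous right and left frontal electrodes (often F4 and F3). The sketch below is illustrative; the function name and the sign convention (positive = relatively greater left-frontal activation, since α power varies inversely with cortical activation) are assumptions, not the exact formula of any reviewed study:

```python
import math

def frontal_alpha_asymmetry(alpha_power_left, alpha_power_right):
    """Log-ratio asymmetry index from left/right frontal alpha power.

    Positive values suggest relatively greater left-frontal activation,
    because alpha power varies inversely with cortical activation.
    """
    return math.log(alpha_power_right) - math.log(alpha_power_left)
```

Equal power on both sides gives an index of zero; lower left-frontal α than right-frontal α gives a positive index.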
In one investigation, the author identified 30 subject-independent features that were most associated with emotion processing across subjects and explored the feasibility of using fewer electrodes to characterise the EEG dynamics during music listening [17]. For rock-pop, electronic, jazz and broadband-noise stimuli, the author examined the relation between subjects' EEG responses and self-evaluated liked or disliked music; activity in the β and γ bands may indicate a relationship between music preference and emotional arousal phenomena [18]. In another article, the author found that the beta and theta frequency bands perform better than the other frequency bands [19]. The author of [20] investigated liked and disliked music under three familiarity conditions (music regardless of familiarity, familiar music and unfamiliar music), finding that familiar music gives the highest classification accuracy compared with the other two conditions. Among the musician and non-musician subjects who participated in the research, the authors found that musicians have significantly lower frontal γ activity during music listening and music imagining than in the resting state [21]. One author classified euphoric versus neutral, happy versus melancholic, and familiar versus new musical excerpts, investigated the brain networks related to happy, melancholic and neutral music, and related inter-/intra-regional network patterns to the self-reported assessment of the musical excerpts [22]. Among thirty participants in three different age groups (15-25 years, 26-35 years and 36-50 years), the brain signals of the 26-35 years age group gave the best emotion recognition accuracy with respect to the self-reported emotions [23]. Another author proposed a novel user identification framework using EEG signals recorded while listening to music [24].
Other authors quantified emotional arousal corresponding to different musical clips [25]. One author suggests that unfamiliar songs are the most appropriate for the construction of an emotion recognition system [26]. The author of [27] explores the impact of the Indian instrumental music Raag Bhairavi using frontal theta asymmetry. The authors of [28,29] propose a frontal theta asymmetry model for estimating the valence of evoked emotions and also suggest electrode reduction for neuromarketing applications. Another author proposes frontal theta as a biomarker of depression [30].

Handedness
The human brain has two anatomically similar hemispheres, but each hemisphere has functional specialisations. Handedness is, by a simplistic definition, the dominant hand used in day-to-day activity [31]. Each hemisphere has specific dominant functions; for example, language abilities reside in the left hemisphere in right-handed people [32]. The brain is cross-wired: in the majority of people, the left hemisphere of the cerebrum controls the right side of the body and vice versa. In research involving the brain and stimuli, we first need to know the subject's handedness, as it is an indicator of the dominant hemisphere. Since the dominant hemisphere has specialised functions, observations, findings and interpretations differ according to dominance. Many functions switch hemispheres depending on dominance in a particular person; for example, left-handed people process language in the right hemisphere and right-handed people in the left [33]. The brain patterns of right- and left-handed persons are different [34]. This section analyses the nature of the subjects considered in the reviews.

Participants
The number of subjects ranged from 5 to 79 with a median of 20, and most of the researchers used unbalanced numbers of males and females (see Table 2). When few subjects participate in a study, the outcome of the hypothesis test is always questionable. In 78% of the research, the authors reported right-handed subjects without using any handedness inventory; only 22% of the research used the Edinburgh handedness inventory [37,38]. In 95% of the investigations, the researchers recruited normal participants, but few of them verified normalcy. Most of the researchers selected participants who were students or working staff from the same background. The authors of [23] investigated the impact of the musical stimulus on different age groups. The authors of [21] studied the effect of the musical stimulus by recruiting musician and non-musician subjects. The authors of [30,35] investigated the impact of the musical stimulus on clinically depressed subjects.

Musical stimulus type, duration and emotions
Excerpts of pleasant and unpleasant music from different genres were selected to evoke different types of emotions; the stimuli chosen included classical, rock, hip-hop, jazz, metal, African drums, Oscar tracks and environmental sounds (refer Table 3). The authors of [13,18] used noise along with a pleasant stimulus to elicit negative emotion. The authors of [20] used familiar, unfamiliar and regardless-of-familiarity music. Stimulus durations ranged from 2 s to 10 min with a median of 30 s, with different excerpts interleaved with a time gap. Self-responses of the evoked emotions were collected from the participating subjects. The studies investigated positive and negative emotions such as fear, happiness, sadness, anger, tiredness, like, dislike, anxiety and depression. Some authors used a feel tracer to measure the arousal effect of the stimulus.

EEG machine and channel investigated
Twelve different EEG machines are used in the reviewed articles (refer Tables 4 and 5). The number of electrodes used in the reviewed articles ranges from 1 to 63 with a median of 21.5. A total of 75% of the articles reported referential montages taking A1 and A2 as reference electrodes. The authors of [11,35] used the vertex electrode Cz as reference.
The author of [20] used the frontal mid-line electrode Fz as well as A1 and A2 as reference electrodes. The author of [18] used a Laplacian montage.

Preprocessing for artefact and feature extraction
Most of the articles reported manual, offline artefact removal; a few articles used filters and the Laplacian montage method [19]. A notch filter was also used to remove power-line interference before feature extraction. Most of the articles used Fourier transforms, either DFT or STFT (56.25%); 12% of the researchers used wavelet transforms, and 6.25% applied DFA and time-domain analysis. The author of [18] applied time-frequency transforms (Zhao-Atlas-Marks, STFT, Hilbert-Huang spectrum) (refer Table 6).
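To make the dominant FFT/DFT feature-extraction step concrete, the sketch below computes the relative power of a signal in a given band from a plain DFT periodogram. It is deliberately naive (no windowing, detrending or epoch averaging) and is only an exposition of the idea, not any reviewed pipeline:

```python
import cmath
import math

def band_power(signal, fs, f_lo, f_hi):
    """Relative power of `signal` (sampled at `fs` Hz) in [f_lo, f_hi),
    from a plain DFT periodogram over the positive frequencies."""
    n = len(signal)
    power_in_band = 0.0
    power_total = 0.0
    for k in range(1, n // 2):          # skip DC, use positive half only
        f = k * fs / n
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        p = abs(coeff) ** 2
        power_total += p
        if f_lo <= f < f_hi:
            power_in_band += p
    return power_in_band / power_total if power_total else 0.0
```

A pure 10 Hz sinusoid sampled at 128 Hz for one second, for instance, concentrates essentially all of its relative power in the 8-13 Hz alpha band.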

Brainwave and location investigated and statistical test
31.25% of the researchers investigated all brainwaves (δ, θ, α, β and γ) together; the remainder selected a few of them or studied a single band independently. Across all the reviews, α was investigated in 75%, γ in 37.5%, β in 56.25%, δ in 37.5% and θ in 87.5% of the studies. Almost all researchers investigated the frontal region only. The author of [20] investigated all regions of the brain and correlated γ waves with memory processing. Twenty-five per cent of the reviews conducted statistical tests, namely ANOVA, t test and Z test.
Most of the authors considered a significance level of 0.05. Seventy-five per cent of the reviews directly applied machine learning algorithms (refer Table 7).
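For reference, the paired t test used in several reviews reduces to the statistic below when one feature (say, frontal θ power) is compared across two stimulus conditions on the same subjects. This is a stdlib-only sketch; the degrees of freedom and the p-value lookup against the 0.05 significance level are omitted:

```python
import math

def paired_t(a, b):
    """Paired-sample t statistic for equal-length condition vectors a, b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)
```

The resulting statistic is compared against the t distribution with n - 1 degrees of freedom to decide significance at the chosen level.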

Machine learning algorithms
In all, 72% of the reviews employed supervised learning algorithms, namely k-NN, SVM, MLP, LDA, QDA and HMM, with the subjects' self-responses used to supervise the learning. Twenty-eight per cent of the reviews used statistical tests, namely t test, ANOVA and Z test. Forty per cent of the reviews used SVM along with other classifiers for classifying emotions. Classification accuracy is the most used metric. No study reported unsupervised machine learning algorithms (see Table 8).
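Of the supervised algorithms listed, k-NN is the simplest to state. Below is a self-contained sketch with Euclidean distance and majority voting; the toy 2-D "feature vectors" in the usage note merely stand in for real EEG feature vectors:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    neighbours = sorted((math.dist(x, xi), yi)
                        for xi, yi in zip(train_X, train_y))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```

With three "sad" training points near the origin and three "happy" points near (5, 5), a query close to the origin is labelled "sad" and one near (5, 5) is labelled "happy".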

Participants
The vast majority of engineering-domain studies, especially articles on IEEE Xplore, consider very few subjects, approximately 11 on average. To test a hypothesis, a minimum of 30 subjects is required in a study [39]. When scholars use subjects of both sexes, the numbers of each should be equal. Most of the authors recruited normal subjects without confirming their normalcy, and homogeneous populations were considered. This is a multidisciplinary field in which human factors and experimental psychology are involved [40]. Most of the studies conducted by engineering fraternities lack clinical guidance. Handedness was often not considered, and the studies that did consider it were evasive about the handedness evaluation method.

Musical stimulus and dimension of emotion
The reviews used various genres of musical stimuli; to evoke different emotions among the subjects, excerpts with different emotional content were employed. Most of the reviews employed familiar musical stimuli, although the author of [26] empirically showed that unfamiliar excerpts are the most suitable for constructing an emotion identification system. Various emotions were considered for emotion classification; a higher number of emotions makes emotion recognition difficult, and some emotions may overlap [41]. In most surveys, a 1-dimensional emotion model was used. A feel tracer was used to investigate arousal, but the feel tracer instrument is not reliable [42]. No reviews report the automatic prediction of both valence and arousal in a 2-dimensional model for the same excerpt of musical stimuli. High-frequency brainwaves like beta and gamma were used to correlate with the arousal of emotion [43], while low-frequency waves like alpha or theta were used for the valence of emotion [11,13]. Arousal and valence for the same excerpt of stimulus are plotted on the same graph as shown in Fig. 4.

Emotional processing in depression
Emotions are broadly classified as positive and negative for the sake of understanding their processing in the brain. Broadly, positive emotions are processed in the left anterior hemisphere (the prefrontal cortex) of the brain and negative emotions in the right [44]. In depression, the hypothesis is that hypo-arousal of the left anterior hemisphere or hyper-arousal of the right anterior hemisphere leads to depressive symptoms [45]. EEG patterns support this: findings show that in depression the left anterior hemisphere is relatively inactive compared with the right [27], indicating that patients with depression process stimuli differently from people without depression.

EEG machine and montages
While selecting an EEG machine, the following features should be considered.

Reference  Preprocessing approach                                          Feature extraction
[11]       Offline manual                                                  FFT
[12]       Offline manual                                                  Time domain
[13]       Offline manual                                                  FFT
[14]       Filter of 0-100 Hz, notch filter of 60 Hz and offline manual    STFT
[15]       Filter of 0-100 Hz, notch filter of 60 Hz and offline manual

Montages are sensible and efficient arrangements of electrode pairs, called channels, that display EEG activity over the whole scalp, permit assessment of activity on the two sides of the cerebrum (lateralisation) and aid in the localisation of recorded activity to a specific brain region [46].

-Bipolar Montage
In a bipolar montage, each waveform signifies the difference between two adjacent electrodes. This class of montage is designated longitudinal bipolar (LB) or transverse bipolar (TB). Longitudinal bipolar montages measure the activity between two electrodes placed along an anteroposterior (front-to-back) line of the scalp.

-Laplacian Montage
In this montage, the difference between an electrode and a weighted average of the surrounding electrodes is used to represent a channel.
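A single Laplacian-referenced channel can be sketched as follows; uniform neighbour weights are an assumption for brevity (practical implementations often weight neighbours by inter-electrode distance):

```python
def laplacian_channel(centre, neighbours, weights=None):
    """Centre electrode value minus a weighted average of its neighbours."""
    if weights is None:
        # Assumed default: uniform weighting of all surrounding electrodes
        weights = [1.0 / len(neighbours)] * len(neighbours)
    return centre - sum(w * v for w, v in zip(weights, neighbours))
```

For a centre sample of 10 μV surrounded by 2, 4 and 6 μV, the channel value is 10 minus their average of 4, i.e. 6 μV.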

Preprocessing for artefacts
EEG recording is exceedingly susceptible to various forms and sources of noise. The morphology and electrical characteristics of artefacts can lead to significant difficulties in the analysis and interpretation of EEG data. Table 9 shows various types of artefacts. The morphology of external artefacts is easily distinguishable from the actual EEG [47]. When recording for a long duration with many electrodes, an artefact-free recording protocol is the best strategy for preventing and minimising all types of artefacts [27]:
-Educate the participants about eye and bodily movements
-Do not permit electronic gadgets in the EEG recording lab
-Record in an acoustically isolated room, in dim light and at ambient temperature
-Reject all slots of the EEG signal containing muscle, ocular or movement artefacts
-Have participants wash their hair to remove oil from the scalp
-Use a proper montage
Time-frequency analysis comprises those techniques that study a signal in both time and frequency simultaneously, which makes it appropriate for event-related emotion recognition.

Brainwave and location
In the existing literature, the frontal region is mostly explored, as it is associated with emotion processing. A few researchers investigated a single wave correlating with the evoked emotion. As mentioned in Sect. 1, a musical stimulus creates many psychological changes in subjects; examining only the frontal region and a few waves is not enough to create a model of the evoked emotion. Various lobes and many waves, and their interrelationships, need to be explored.

Machine learning algorithm
SVM is a supervised machine learning algorithm which can be used for classification or regression problems and is suitable for the classification of evoked emotions. SVM uses the kernel trick to transform the data and then finds an optimal boundary between the possible outputs. Nonlinear kernels can capture much more complex relationships between data points without the need to perform difficult transformations explicitly [48]. Its features include:
-High prediction speed
-Fast training speed
-High accuracy
-Interpretable results
-Performs well with small numbers of observations
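The kernel trick can be made concrete with the widely used Gaussian (RBF) kernel, which scores the similarity of two feature vectors as if they had been compared in a very high-dimensional space, without ever computing that mapping. The value of gamma below is only an illustrative default:

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel value for two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The kernel equals 1 for identical vectors and decays smoothly towards 0 as the vectors move apart, which is what allows an SVM to draw nonlinear boundaries in the original feature space.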

Model performance metrics
Healthcare and engineering models have different obligations, so the assessment metrics should differ, and a model should not be judged using a single metric; classification accuracy is the metric most often considered in the reviews for assessing the models. The model performance is represented in the form of the confusion matrix shown in Eq. (1).

Cp = | Tp  Fp |
     | Fn  Tn |    (1)
Assume the inadequate model shown by Eq. (3) has true-positive and false-positive values of zero; its classification accuracy by Eq. (2),

Accuracy = (Tp + Tn) / (Tp + Tn + Fp + Fn),    (2)

is still 83.33%. Accuracy is therefore not a reliable metric for the assessment of a model. Apart from classification accuracy, there are many metrics for model assessment, such as sensitivity, specificity, precision, NPV (negative predictive value), FDR (false discovery rate), F1 score, FPR (false-positive rate), FNR (false-negative rate), MCC (Matthews correlation coefficient), informedness (Youden index), markedness and ROC (receiver operating characteristic). Metrics such as recall, specificity, precision and accuracy are biased [49]. ROC graphs depict the trade-off between the hit rates and false-alarm rates of classifiers and have been used for a long time [50,51]. As ROC decouples model performance from class skew and error costs, it is the best measure of classification performance; ROC graphs are useful both for building models and for evaluating their performance [52]. For a small number of positive-class samples, F1 and ROC give a precise assessment of models [53,54].
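The accuracy pitfall above is easy to reproduce numerically. Since Eq. (3) is not reproduced in this text, the sketch assumes a hypothetical 6-sample confusion matrix (Tp = 0, Fp = 0, Fn = 1, Tn = 5) that yields the quoted 83.33% accuracy even though the model never detects a positive case:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity (recall) and specificity from confusion counts.

    None is returned for any metric whose denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return accuracy, sensitivity, specificity

# Hypothetical degenerate model: no positives ever predicted
acc, sens, spec = confusion_metrics(tp=0, fp=0, fn=1, tn=5)
# accuracy is high (about 0.833) while sensitivity is 0.0
```

This is exactly why accuracy alone, the metric most reviews relied on, can flatter a model that misses every positive case.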

Suggested approach
As this research is interdisciplinary, collaborative research involving the medical fraternity (of psychiatry or neurology background) and a music expert will satisfy Brouwer's [40] recommendations I, II and VI. Recording EEG in three continuous sessions, pre-stimulus, during stimulus and post-stimulus, helps in comparison with baseline changes, and post hoc selection of data satisfies Brouwer's recommendation III. Moreover, the remaining Brouwer recommendations IV and V are satisfied by recording EEG using the artefact-removal protocol mentioned in Sect. 4.4 and Table 9, and by analysing the data using proper statistical tests and machine learning algorithms (refer Fig. 6 for the suggested approach). Comparison of left and right hemispheric activity gives vivid results, and the model formed is called an asymmetry model (refer Fig. 5). Most of the reviews compared left-brain activity with right-brain activity; establishing a mathematical relationship between the two for a given stimulus would be more significant.