SOTA EEG-based Emotion Recognition with Visual and Auditory Stimuli

Tram Ho

Returning to the topic of this article, I will share a research topic I have been working on recently: EEG, also known as electroencephalography or brainwave data. EEG is not new, but it is not really popular and research on it is limited. The main reasons are the lack of data and equipment, and that, in addition to AI and signal-processing skills, it also requires knowledge from biomedical and physiological fields. At present EEG has many applications: besides problems related to the human brain and emotions, it is also used to improve accuracy in NLP and recommendation systems (music, ...), among other areas. In some problems, Eye Tracking data is used alongside EEG; you can refer to papers combining these two types of data here.

In this article, we will analyze a subset of EEG research: Emotion Recognition. The article is based on a paper that uses film clips as the source of emotional stimulation, i.e. combined auditory and pictorial stimuli. You can refer to the paper here.

The article includes the following main contents:

  • Introduction to the EEG
  • EEG contaminations and a number of solutions
  • Introduction to EEG-based Emotional Recognition
  • Building the dataset
  • EEG Data Acquisition, Preprocessing, Feature Extraction and Normalization
  • Emotion Classification with Feature Selection

I. EEG Introduction

Illustration of an EEG recording

Electroencephalography (EEG) is waveform data recorded from the brain using a specialized set of electrodes placed on a person’s scalp. The number of electrodes is usually 32, 64, or 128, but setups with 2, 14, or 256 electrodes also exist; the larger the number, the more diverse and fine-grained the recorded brain data.

Three EEG data terms we should be familiar with are Raw, Epochs, and Evoked. Raw is the original recorded data, still containing untreated noise; Epochs (or Trials) are short segments cut from the raw or preprocessed EEG, usually centered on repetitions of a stimulus or event; Evoked is the average over the Epochs.

EEG is divided into 5 main frequency bands, in which:

  • Delta band: 0 – 3.5 Hz
  • Theta band: 3.5 – 8 Hz
  • Alpha band: 8 – 13 Hz
  • Beta band: 13 – 30 Hz
  • Gamma band: > 30 Hz (usually 30 – 45 Hz)
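As a quick sketch of how these bands are used in practice (my own toy example, not from the paper), we can estimate a signal's power spectral density with Welch's method and sum the power falling inside each band:

```python
# Band power of a synthetic signal containing an alpha (10 Hz) and a weaker
# beta (20 Hz) component, using Welch's PSD estimate from SciPy.
import numpy as np
from scipy.signal import welch

fs = 128                               # sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

bands = {"delta": (0.5, 3.5), "theta": (3.5, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

f, psd = welch(x, fs=fs, nperseg=2 * fs)
power = {name: psd[(f >= lo) & (f < hi)].sum() for name, (lo, hi) in bands.items()}
```

Here the alpha band dominates, as expected from the 10 Hz component with the larger amplitude.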

To work with EEG, there are currently a number of tools for each language. With Python, I suggest MNE; this toolkit is quite complete, covering simulation, preprocessing, and sample datasets. MATLAB has the famous EEGLAB.

The obtained EEG data usually contains a lot of noise, so for the model to return good results, the preprocessing and feature-extraction steps are extremely important. In Part II, we will learn about some types of noise and how to deal with them.

II. EEG Contaminations and Solutions

To distinguish EEG contaminations, we look at the sources that cause them: they can come from human physiological activities such as breathing, blinking, heartbeats, or sweat glands, or from external sources such as the surrounding environment. In this article we will only discuss the main sources that come from the human body itself.

1. Ocular activity

This is the ocular artifact, termed EOG, derived from actions such as winking, blinking, and eye movement. Its recognizable feature is a sudden change in wave amplitude, in the 100-200 microvolt range. In the time domain, the change in amplitude is easy to see; see the image below:

Observing channels F7, Fpz, F8, and F3, we can see this easily. In the frequency domain, EOG artifacts are easily confused with the delta and theta bands.
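Because these deflections are so much larger than background EEG, a simple peak-to-peak amplitude threshold already catches many of them. Here is a toy illustration (my own sketch; the threshold and the assumption that amplitudes are in microvolts are illustrative):

```python
# Reject epochs whose peak-to-peak amplitude suggests an ocular artifact.
import numpy as np

def reject_high_amplitude(epochs, threshold_uv=100.0):
    """epochs: array (n_epochs, n_channels, n_samples) in microvolts.
    Keeps only epochs whose peak-to-peak amplitude stays below threshold."""
    ptp = epochs.max(axis=-1) - epochs.min(axis=-1)   # per epoch, per channel
    keep = (ptp < threshold_uv).all(axis=1)
    return epochs[keep]

rng = np.random.default_rng(0)
clean = rng.normal(0, 10, size=(1, 2, 128))           # ~10 uV background EEG
blink = clean.copy()
blink[0, 0, 60:70] += 150.0                           # blink-like 150 uV deflection
kept = reject_high_amplitude(np.vstack([clean, blink]))
```

The blink-contaminated epoch is dropped while the clean one survives; this is exactly the "artifact rejection" strategy discussed in the solutions section below, with its drawback of throwing away data.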

2. Muscle activity

Termed EMG, these artifacts come from movements of human muscles. Their defining feature is high-frequency activity overlapping the EEG in the time domain; the magnitude of the EMG is proportional to the strength of the active muscle group.

Observe channels C3 and Cz in the image. In the frequency domain, EMG artifacts are easily confused with the beta and gamma bands.

3. Cardiac activity

ECG artifacts, as the name implies, derive from the heart's activity: contractions produce signals that interfere with the brain waves. Their identifying characteristic is a periodic up-and-down rhythm corresponding to the heartbeat.

Looking at the picture above, we can see where the ECG signal overlaps the EEG (red arrow).

4. Respiration

This artifact is due to the movement of the chest and head during respiration. Its identifying characteristics are low-frequency waves whose amplitude varies over time.

In the frequency domain, they are also easily confused with the delta and theta bands.

5. Solutions

There are quite a few ways these artifacts can be handled:

  • EEG artifact rejection: this method simply discards the epochs or trials that contain interference. However, this is not a good option when we have too little data to analyze and train on, because besides removing the noise in some channels, it also discards clean EEG in the other channels.
  • Hence the advent of Filtering, which removes frequencies outside a certain range while trying to retain as much EEG as possible, e.g. lowpass, bandpass, or highpass filters. Alternatively, you can use regression methods combined with a reference signal to remove EOG and ECG.
  • Blind Source Separation: the main idea is to model the recorded EEG as linear combinations of underlying signal sources and separate them. One of the most popular algorithms is ICA (Independent Component Analysis).

  • Source decomposition methods: unlike Blind Source Separation, these transform each channel separately instead of all channels at once; each channel is decomposed into basic waveforms, the noise is removed, and the signal is reconstructed to obtain clean EEG data.
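The ICA idea can be sketched in a few lines with scikit-learn's FastICA (this is my own toy demonstration, not the paper's exact pipeline): mix a sinusoidal "brain" source with a spiky "blink" source into three channels, unmix, zero out the high-kurtosis (blink-like) component, and reconstruct.

```python
# ICA-based artifact removal sketch: separate, drop the spiky component, rebuild.
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

fs = 128
t = np.arange(0, 10, 1 / fs)
brain = np.sin(2 * np.pi * 10 * t)                       # alpha-like oscillation
blink = np.zeros_like(t)
blink[np.arange(64, len(t), 256)] = 5.0                  # sparse blink spikes
S = np.c_[brain, blink]                                  # true sources
A = np.array([[1.0, 0.8], [0.6, 0.3], [0.4, 1.2]])       # mixing matrix (3 channels)
X = S @ A.T                                              # observed "EEG" channels

ica = FastICA(n_components=2, random_state=0)
sources = ica.fit_transform(X)                           # estimated components
bad = int(np.argmax(kurtosis(sources, axis=0)))          # spiky => high kurtosis
sources[:, bad] = 0.0                                    # drop artifact component
X_clean = ica.inverse_transform(sources)                 # back to channel space
```

In practice (e.g. with MNE's ICA), the artifact component is usually picked by correlating components against an EOG/ECG reference channel rather than by kurtosis alone.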

III. EEG-based Emotional Recognition

With the emotion recognition problem, there have been experiments using other types of data such as self-reports, behavioral responses, and physiological measures; however, EEG is said to give the most accurate results. Currently, there are 2 main models for representing each person's emotional space.

  • The dimensional model: uses 2 values (valence-arousal) or 3 (valence-arousal-dominance). Valence denotes a positive or negative state, arousal denotes the intensity of the emotion (how happy, how sad, ...), and dominance describes whether one is in control or being controlled.
  • The discrete model contains a finite number of human emotions, e.g. joy, sadness, surprise, fear, anger, disgust, ... With discrete models, perception becomes much more difficult, because within the emotional space two positive states are not entirely the same: listening to a funny song feels completely different from watching a comedy video.
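To make the link between the two models concrete, here is a toy sketch (my own illustrative mapping, not the paper's) of how a point in valence-arousal space can be collapsed onto a coarse discrete label:

```python
# Quadrants of the valence-arousal plane mapped to coarse discrete emotions.
def quadrant_emotion(valence, arousal):
    """valence, arousal in [-1, 1]; returns an illustrative discrete label."""
    if valence >= 0:
        return "joy" if arousal >= 0 else "tenderness"   # positive half-plane
    return "anger" if arousal >= 0 else "sadness"        # negative half-plane
```

Real mappings are finer than four quadrants, but this shows why combining the models is attractive: the continuous coordinates carry intensity information that a bare discrete label loses.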

In this paper, the authors improve over older approaches by combining these 2 models to get better results.

IV. Building the Dataset

For EEG-based emotion recognition, the stimulus used as the source of emotional stimulation is quite important: the stronger the elicited emotion, the better the model can perceive it. Popular types of stimuli are photos and sound, or both combined as in this paper. The dataset is built on a set of short films, with each pair of films representing one type of emotion.

List of videos corresponding to each type of emotion

The selection of the films used to elicit EEG is also extremely strict. Before the final 16 films were selected, 111 films were pre-selected by 9 research assistants and viewed by 462 volunteers, who rated them according to 3 indicators: SAM (self-assessment manikin), PANAS (positive and negative affect schedule), and DES (differential emotion scale).

The SAM index

SAM gives discrete values for each emotional dimension on a scale of 5 or 10 (depending on the paper), in the form of a picture-oriented questionnaire.

PANAS example

PANAS contains 2 subscales, Positive and Negative; each subscale has a finite set of emotions rated on a scale of 0-5.

The DES uses a scale of 0-9 describing the degree of emotion along each emotional dimension.

V. EEG Data Acquisition, Preprocessing, Feature Extraction and Normalization

In this paper, the authors use the Emotiv EPOC system, with 14 electrodes and a sampling rate of 128 Hz, to acquire the EEG data. During recording, to minimize noise, participants are placed in a room isolated from external disturbances, instructed to sit in the most comfortable position, and given rest time after each video.

The acquired EEG is then put through a preprocessing step that includes a bandpass filter (1-45 Hz) and ICA to remove EOG, ECG, and EMG artifacts.

Okay, the next step we need to take care of is feature extraction and normalization. Here, the authors use the Short-Time Fourier Transform (STFT) with a sliding time window to obtain a time-frequency (TF) representation. TF analysis represents the EEG signal jointly in time and frequency, providing both amplitude and frequency information.
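A minimal STFT sketch with SciPy (the window length and overlap here are my own choices, not taken from the paper): a 10 Hz test tone shows up as a ridge at 10 Hz in the time-frequency map.

```python
# Short-Time Fourier Transform of a synthetic single-channel "EEG" signal.
import numpy as np
from scipy.signal import stft

fs = 128
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 10 * t)                  # stand-in for one EEG channel

# 1 s window with 50 % overlap -> time-frequency representation
f, times, Z = stft(x, fs=fs, nperseg=fs, noverlap=fs // 2)
power = np.abs(Z) ** 2                          # time-frequency power map
peak_freq = f[np.argmax(power.mean(axis=1))]    # frequency with the most power
```

Averaging `power` over windows within a frequency band yields exactly the band-power time courses that the ERD/ERS computation below needs.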

To extract and normalize features, the signals are computed based on the concepts of event-related desynchronisation (ERD) and event-related synchronisation (ERS), assuming that event-related activity appears as a change in signal power within a certain frequency range.
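The standard ERD/ERS quantification (a general formula from the EEG literature, not specific to this paper) expresses band power in an event window as a percentage change relative to a baseline window:

```python
# ERD/ERS as a relative band-power change: (event - baseline) / baseline * 100.
def erd_ers_percent(power_event, power_baseline):
    """Negative values indicate desynchronisation (ERD, a power drop);
    positive values indicate synchronisation (ERS, a power rise)."""
    return (power_event - power_baseline) / power_baseline * 100.0
```

So a band whose power halves during the stimulus yields -50 %, and one whose power doubles yields +100 %.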

VI. Emotion Classification with Feature Selection

Before entering the prediction model, one last step is feature selection; refining the features during analysis and processing helps minimize noise and keeps the best features for training. Here, the authors use LDA (Linear Discriminant Analysis) to select features and then pass them through k(k-1) SVM models for 3-level classification, where k is the number of emotion labels; each SVM model corresponds to a pair of distinct emotions among joy, amusement, tenderness, anger, sadness, fear, disgust, and neutrality.
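The shape of that pipeline can be sketched with scikit-learn on synthetic data (4 made-up classes; this is an illustrative stand-in, not the paper's implementation): LDA reduces the feature space to at most k-1 discriminant dimensions, and `SVC` trains pairwise (one-vs-one) binary SVMs internally.

```python
# LDA dimensionality reduction followed by pairwise SVM classification.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 4, 30, 20
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n_per_class, n_features))
               for c in range(n_classes)])          # well-separated toy classes
y = np.repeat(np.arange(n_classes), n_per_class)

clf = make_pipeline(
    LinearDiscriminantAnalysis(n_components=n_classes - 1),  # feature reduction
    SVC(kernel="linear", decision_function_shape="ovo"),     # pairwise SVMs
)
clf.fit(X, y)
train_acc = clf.score(X, y)
```

On real EEG features, the inputs would be the normalized ERD/ERS band-power values from the previous section, and accuracy would of course be evaluated on held-out data rather than the training set.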

An example of paper prediction results

One special thing compared with previous studies: instead of directly predicting discrete emotions, the model first distinguishes neutrality from non-neutrality, i.e., whether the subject is in an emotional state at all. If there is emotion, the model continues to predict the valence (positive or negative status) and the arousal level. The final subclassification yields the discrete emotion within the emotional space.
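That cascade can be written out as plain control flow (the thresholds and the reduced label set here are illustrative only, not taken from the paper):

```python
# Hierarchical decision sketch: neutrality -> valence sign -> arousal level.
def hierarchical_emotion(is_neutral, valence, arousal):
    """Stage 1: neutral vs non-neutral; stage 2: valence sign;
    stage 3: arousal level picks the discrete label."""
    if is_neutral:
        return "neutrality"
    if valence > 0:
        return "joy" if arousal > 0.5 else "tenderness"
    return "fear" if arousal > 0.5 else "sadness"
```

Each stage is what one of the pairwise SVM levels decides in the paper's setup; only samples that pass the earlier stages reach the finer-grained distinctions.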

VII. Conclusion

Thank you all for reading this far. Condensing a whole paper into one article inevitably leaves shortcomings, so if you have any questions, please comment below. Any feedback is extremely helpful for me to write better articles. Thank you!

REFERENCES

https://www.researchgate.net/publication/313015621_Real-Time_Movie-Induced_Discrete_Emotion_Recognition_from_EEG_Signals

http://www2.hu-berlin.de/eyetracking-eeg/papersusing.html

https://en.wikipedia.org/wiki/Electroencephalography

https://www.bitbrain.com/blog/eeg-artifacts

https://www.semanticscholar.org/paper/MEG-and-EEG-data-analysis-with-MNE-Python-Gramfort-Luessi/be6638e641c5e993474703de6e0261357da71736

https://www.emotiv.com/

https://mne.tools/stable/index.html

https://sccn.ucsd.edu/eeglab/index.php


Source : Viblo