Key Challenges of EEG Data

Alexey Shovkun joined Neuroscience Software as Chief AI Officer in April 2021. Together with Alexey Matveev, Senior Data Scientist, he is responsible for the company’s AI models, with a focus on deep learning and data analysis for its cloud-based software as a medical device.

Currently, in the majority of mental health clinics, psychiatrists diagnose depression based on the diagnostic criteria in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) or by using other self-reported measures that are prone to the subjectivity of both professionals and patients. This approach, together with the absence of methods based on the analysis of depression biomarkers, often leads to misdiagnosis and inadequate treatment [1].

Electroencephalography (EEG) is an effective, non-invasive method of studying a patient’s brain activity. Recent studies have demonstrated that different EEG features (e.g., linear band power, network-based features, evoked potentials) and their combinations can be used as biomarkers for depression with up to 99% accuracy [2]. Our goal is to provide a technical solution that will help psychiatrists understand EEG reports and help them diagnose and treat patients with depression and monitor their treatment outcomes.
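
To make the idea of a band power feature concrete, here is a minimal sketch that computes the relative power of one EEG channel in the classic frequency bands using a plain FFT periodogram. It is an illustration only, not the pipeline used in the cited studies or in our product. The band edges in `BANDS`, the function names, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np

# Conventional band edges in Hz (an assumption; definitions vary between studies).
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def relative_band_power(signal, fs, band):
    """Fraction of the total signal power that falls inside `band` (low, high) in Hz."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset so it does not dominate the spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    low, high = band
    in_band = (freqs >= low) & (freqs < high)
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0


def band_features(signal, fs, bands=BANDS):
    """Relative power per band for a single channel, returned as a dict."""
    return {name: relative_band_power(signal, fs, edges) for name, edges in bands.items()}


# Synthetic example: a 10 Hz (alpha) rhythm plus a weaker 20 Hz (beta) component.
fs = 250  # sampling rate in Hz (hypothetical)
t = np.arange(1000) / fs
demo = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
print(band_features(demo, fs))  # alpha carries about 0.8 of the power, beta about 0.2
```

In practice, a smoothed spectral estimate (for example, Welch's method) and per-channel features across the whole montage would likely be preferred over a single raw periodogram.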

In his first post, Alexey discusses his accomplishments in AI and ML and the challenges that EEG data presents to his team.

Before joining Neuroscience Software, I was Head of Data Science at Fayrix, a development company specializing in remote software teams and bespoke services for startups. My team at Fayrix was responsible for delivering all types of ML solutions to customers from hugely diverse fields, including the commercial and medical sectors.

I have a master’s and a Ph.D. in AI, and I have been conducting data and predictive analyses for over 20 years. I have taken part in numerous competitions on Kaggle, the world’s largest community of data scientists, with more than 1 million users. The platform has hosted hundreds of machine learning competitions with thousands of participants from all over the world, with goals ranging from improving gesture recognition for Microsoft Kinect to making a football AI for Manchester City to improving the search for the Higgs boson at CERN [3]. These competitions have resulted in many successful and unique projects, and several academic papers have been published based on their findings [3]. I hold the title of Kaggle Competitions Master, as does my colleague Alexey Matveev (details here and here, respectively).

Although encephalographic analysis was relatively new to our DS team, we are betting on fresh insights and out-of-the-box ideas based on AI/ML to help clinicians perform EEG-based depression screening faster and more reliably. This matters because depression is often diagnosed through questionnaires, which are prone to professional and patient subjectivity.

The key challenges of EEG data that we ran into at the beginning:

  1. Routine EEG is generally performed with the international 10-20 system of scalp electrode placement, as illustrated below. The system is based on the relationship between the locations of the electrodes and the underlying areas of the brain. As is known from tomography, different brain areas likely relate to different functions. However, scalp electrodes may not specifically reflect these particular areas; the exact locations of activity are still open to discussion due to uncertainties caused by, for example, the non-homogeneous properties of the skull, the various orientations of the cortex sources, and the incoherence between the sources [4]. The correct placement of the electrodes, an adequate number of them, and an EEG cap suited to the specific electrode type (i.e., passive, active, dry, or sponge) are all essential for obtaining high-quality data for subsequent interpretation. Moreover, many alternative electrode placement systems exist, particularly ones with increased numbers of sensors [5]. This wide range of options adds complexity to the task of analyzing the collected data.


  1. Encephalographic measurements are obtained via a system that consists of electrodes along with a conductive medium, amplifiers with filters, an A/D converter, and a recording device. Signal quality is critical for effective EEG analysis. Because EEG signals consist of low-amplitude oscillations, they need to be amplified to make them compatible with display units, recorders, and A/D converters. However, the amplification of the physiological signal must be selective so that superimposed noise and interference are rejected and so that the patient and equipment are protected from damage caused by voltage or current surges. In addition, many factors affect the signal quality and thus the resulting data, such as the amplifier’s bandwidth and sampling rate [5].


  1. EEG analysis also depends on the method of signal acquisition, which is subdivided into several general groups of tasks and states, such as emotion recognition, motor imagery, mental workload, seizure detection, event-related potential detection, sleep scoring, and resting-state analysis. However, the tasks in these groups can number in the thousands, depending on what is being studied; for example, mental workload measurement involves recording EEG data as the subject completes tasks of varying degrees of complexity. Many methods have been used to categorize the workload levels of, for example, driving via simulations and airplane piloting via studies and responsibility tasks based on data about subjects’ behaviors, such as their reaction time and path deviation [6]. EEG analysis is further complicated because the state of the patient under study is not static. In addition, the results are inherently noisy, and the electrodes can pick up thinking signals as well as unwanted physiological signals, such as electrical activity from eye blinks and neck muscles. There are also concerns about the motion artifacts that occur when the subject moves [7].


  1. In addition to these technical problems, we also face the challenge that most psychiatrists currently diagnose depression using the Diagnostic and Statistical Manual of Mental Disorders or other self-reported measures that are prone to the subjectivity of both professionals and patients [8]. For the ML team, this means that the target variable has no single ground ‘truth.’ We therefore have to look at the EEG analysis as a whole.
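
As an illustration of the artifact problem described in the list above (eye blinks, muscle activity, and motion), the sketch below cuts a multichannel recording into fixed-length epochs and drops any epoch whose peak-to-peak amplitude exceeds a threshold on at least one channel. This is a common rule-of-thumb approach, not a description of our production pipeline. The function names, the 1-second epoch length, and the 100 µV threshold are assumptions made for the example.

```python
import numpy as np


def split_into_epochs(data, fs, epoch_sec):
    """Cut (n_channels, n_samples) data into (n_epochs, n_channels, samples_per_epoch).

    Trailing samples that do not fill a whole epoch are dropped.
    """
    data = np.asarray(data, dtype=float)
    step = int(round(fs * epoch_sec))
    n_epochs = data.shape[1] // step
    trimmed = data[:, : n_epochs * step]
    return trimmed.reshape(data.shape[0], n_epochs, step).transpose(1, 0, 2)


def reject_by_peak_to_peak(epochs, threshold_uv=100.0):
    """Keep only epochs whose peak-to-peak amplitude stays within the threshold on every channel.

    Returns the clean epochs and a boolean mask marking which epochs were kept.
    """
    ptp = epochs.max(axis=2) - epochs.min(axis=2)  # shape: (n_epochs, n_channels)
    keep = (ptp <= threshold_uv).all(axis=1)
    return epochs[keep], keep


# Synthetic two-channel recording (values in microvolts) with one blink-like spike.
fs = 100  # sampling rate in Hz (hypothetical)
t = np.arange(500) / fs
recording = np.vstack([10 * np.sin(2 * np.pi * 10 * t), 10 * np.cos(2 * np.pi * 10 * t)])
recording[0, 250] += 300.0  # artificial artifact in the third second of channel 0

epochs = split_into_epochs(recording, fs, epoch_sec=1.0)
clean, keep = reject_by_peak_to_peak(epochs, threshold_uv=100.0)
print(keep)  # the third epoch, which contains the spike, is marked False
```

A fixed threshold is deliberately simple: it discards data rather than repairing it, and the right value depends on the amplifier, the electrode type, and the recording protocol.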

In the next post, I will discuss the challenges posed to my team and the process we undertook to solve them. 

If you are working on similar tasks, please get in touch: we are ready to cooperate and share our experiences with you.


[1] Koukopoulos, A., Sani, G., & Ghaemi, S. (2013). Mixed features of depression: Why DSM-5 is wrong (and so was DSM-IV). British Journal of Psychiatry, 203(1), 3-5. doi:10.1192/bjp.bp.112.124404 

[2] Ay, B., Yildirim, O., Talo, M., et al. (2019). Automated depression detection using deep representation and sequence learning with EEG signals. Journal of Medical Systems, 43, 205. https://doi.org/10.1007/s10916-019-1345-y

[3] https://en.wikipedia.org/wiki/Kaggle 

[4] Teplan, M. (2002). Fundamentals of EEG measurement. Measurement Science Review, 2(2), 1-11. https://www.measurement.sk/2002/S2/Teplan.pdf

[5] Widge, A. S., Bilge, M. T., Montana, R., Chang, W., Rodriguez, C. I., Deckersbach, T., Carpenter, L. L., Kalin, N. H., & Nemeroff, C. B. (2019). Electroencephalographic biomarkers for treatment response prediction in major depressive illness: A meta-analysis. The American Journal of Psychiatry, 176(1), 44-56. https://ajp.psychiatryonline.org/doi/10.1176/appi.ajp.2018.17121358

[6] Nunez, P. L. (1995). Neocortical dynamics and human EEG rhythms. New York: Oxford University Press.

[7] Craik, A., He, Y., & Contreras-Vidal, J. L. (2019). Deep learning for electroencephalogram (EEG) classification tasks: A review. Journal of Neural Engineering, 16(3). https://iopscience.iop.org/article/10.1088/1741-2552/ab0ab5

[8] Hosseinifard, B., Moradi, M. H., & Rostami, R. (2013). Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Computer Methods and Programs in Biomedicine, 109(3), 339-345.
