Samuel Michel: Generalizable Automatic Classification of Sleep Stages

Posted on Sun 30 July 2023 in theses


This figure illustrates the EEG channel configuration for the selected datasets. While EDF-ST and EDF-SC share the same channels, only three channels are common between MASS (SS3) and EDF (ST and SC).

The gold standard for diagnosing sleep disorders is polysomnography (PSG). A PSG consists of sleeping one or several nights at a hospital or a sleep center while wearing sensors that continuously record various time series (e.g. electroencephalograms, electrocardiograms, electromyograms, oximetry, respiration rate). An expert then annotates the PSG with the different sleep phases (paradoxical, light, moderate and deep sleep), producing a hypnogram that is used for sleep disorder diagnosis. This manual annotation process is affected by human limitations: it is time-consuming, tedious, unreliable, and sensitive both to the setup of each clinic and to motion noise. Indeed, each sleep center defines its own PSG setup. Moreover, some data may be lost or corrupted by noise when the patient moves during the night. Regarding reliability, several studies have shown that two experts may annotate the same PSG differently.

The aim of this work is to investigate whether the classification of PSG recordings into the different sleep phases can be automated using machine learning. The main concern is the capacity of such algorithms to be faster and more reliable than manual scoring. Two follow-up questions gravitate around this main scientific question. We focus on models that are robust to the setup of different clinics and to noise, and that are fair to different populations. One step of our work is therefore to analyse the ability of an automated classifier to handle data coming from different sleep centers.

We scoped this study to stateless models, which do not take temporal context into account, and investigated both hand-crafted and learnable feature extractors. In terms of intra-database performance, our best model was the CNN proposed by Chambon et al. [Chambon-2018]. However, when evaluating generalization across different setups, the random forest model with manually chosen features described in the same paper emerged as the best model.
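To make the reliability concern concrete, disagreement between two scorers of the same recording can be quantified with an agreement statistic. The sketch below uses Cohen's kappa, which is an assumption on our part: the text above does not say which metric the cited studies or the thesis used. The function name `cohen_kappa` and the two short hypnograms are hypothetical and purely illustrative. They are not data from the thesis.

```python
from collections import Counter


def cohen_kappa(scorer_a, scorer_b):
    """Cohen's kappa between two equally long sequences of epoch labels."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(scorer_a)

    # Observed agreement: fraction of epochs that received the same label.
    p_observed = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n

    # Chance agreement: sum over labels of the product of the two scorers'
    # marginal label frequencies.
    freq_a = Counter(scorer_a)
    freq_b = Counter(scorer_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_chance == 1.0:
        # Both scorers used one identical label throughout; kappa is formally
        # undefined here, and this sketch chooses to report full agreement.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical hypnograms, one label per epoch (illustrative only).
expert_1 = ["light", "light", "moderate", "moderate", "deep",
            "deep", "deep", "paradoxical", "paradoxical", "light"]
expert_2 = ["light", "moderate", "moderate", "moderate", "deep",
            "deep", "moderate", "paradoxical", "paradoxical", "light"]

kappa = cohen_kappa(expert_1, expert_2)
print(f"Cohen's kappa between the two scorers: {kappa:.3f}")
```

The same function could be applied to compare a classifier's predicted hypnogram with an expert's, which puts automated and manual scoring on a common scale. Kappa corrects raw agreement for the agreement expected by chance, which matters when some sleep phases are much more frequent than others.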

Reproducibility Checklist

Thesis report

Software with preset configurations to reproduce published findings.

All databases are publicly available

[Chambon-2018] S. Chambon, M. N. Galtier, P. J. Arnal, G. Wainrib, and A. Gramfort. A Deep Learning Architecture for Temporal Sleep Stage Classification Using Multivariate and Multimodal Time Series. IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 4, pp. 758–769, Apr. 2018, https://doi.org/10.1109/TNSRE.2018.2813138.