Catalogue Search | MBRL
Explore the vast range of titles available.
153 result(s) for "Westover, M. Brandon"
Expert-level automated sleep staging of long-term scalp electroencephalography recordings using deep learning
by Sarkis, Rani A; Pellerin, Kyle R; Westover, M Brandon
in Algorithms, Automation, Big Data Approaches to Sleep and Circadian Science
2020
Abstract
Study Objectives
Develop a high-performing, automated sleep scoring algorithm that can be applied to long-term scalp electroencephalography (EEG) recordings.
Methods
Using a clinical dataset of polysomnograms from 6,431 patients (MGH–PSG dataset), we trained a deep neural network to classify sleep stages based on scalp EEG data. The algorithm consists of a convolutional neural network for feature extraction, followed by a recurrent neural network that extracts temporal dependencies of sleep stages. The algorithm’s inputs are four scalp EEG bipolar channels (F3-C3, C3-O1, F4-C4, and C4-O2), which can be derived from any standard PSG or scalp EEG recording. We initially trained the algorithm on the MGH–PSG dataset and used transfer learning to fine-tune it on a dataset of long-term (24–72 h) scalp EEG recordings from 112 patients (scalpEEG dataset).
Results
The algorithm achieved a Cohen’s kappa of 0.74 on the MGH–PSG holdout testing set and cross-validated Cohen’s kappa of 0.78 after optimization on the scalpEEG dataset. The algorithm also performed well on two publicly available PSG datasets, demonstrating high generalizability. Performance on all datasets was comparable to the inter-rater agreement of human sleep staging experts (Cohen’s kappa ~ 0.75 ± 0.11). The algorithm’s performance on long-term scalp EEGs was robust over a wide age range and across common EEG background abnormalities.
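Cohen's kappa, the agreement statistic reported above, corrects raw agreement between two scorers for the agreement expected by chance. The following is a minimal sketch of the standard two-rater formula, not the authors' evaluation code; the function name and the toy hypnograms are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of epochs on which both raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        # Kappa is undefined when both raters use one identical label throughout;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy hypnograms (ten epochs), for illustration only.
expert = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "R", "R"]
model = ["W", "N1", "N1", "N2", "N2", "N3", "N3", "N3", "R", "R"]
kappa = cohens_kappa(expert, model)
print(round(kappa, 2))
```

In this toy case the two label sequences agree on 8 of 10 epochs and chance agreement is 0.2, so kappa comes out at about 0.75.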
Conclusion
We developed a deep learning algorithm that achieves human expert level sleep staging performance on long-term scalp EEG recordings. This algorithm, which we have made publicly available, greatly facilitates the use of large long-term EEG clinical datasets for sleep-related research.
Journal Article
Algorithm for automatic detection of self-similarity and prediction of residual central respiratory events during continuous positive airway pressure
by Westover, M Brandon; Oppersma, Eline; Thomas, Robert J
in Algorithms, Automation, Continuous Positive Airway Pressure
2021
Abstract
Study Objectives
Sleep-disordered breathing is a significant risk factor for cardiometabolic and neurodegenerative diseases. High loop gain (HLG) is a driving mechanism of central sleep apnea or periodic breathing. This study presents a computational approach that identifies “expressed/manifest” HLG via a cyclical self-similarity feature in effort-based respiration signals.
Methods
Working under the assumption that HLG increases the risk of residual central respiratory events during continuous positive airway pressure (CPAP), the full night similarity, computed during diagnostic non-CPAP polysomnography (PSG), was used to predict residual central events during CPAP (REC), defined as a central apnea index (CAI) higher than 10. Central apnea labels were obtained both from manual scoring by sleep technologists and from an automated algorithm developed for this study. The Massachusetts General Hospital sleep database was used, including 2466 pairs of diagnostic and CPAP titration PSG recordings.
Results
Diagnostic CAI based on technologist labels predicted REC with an area under the curve (AUC) of 0.82 ± 0.03. Based on automatically generated labels, the combination of full night similarity and automatically generated CAI resulted in an AUC of 0.85 ± 0.02. A subanalysis was performed on a population with technologist-labeled diagnostic CAI higher than 5. Full night similarity predicted REC with an AUC of 0.57 ± 0.07 for manual and 0.65 ± 0.06 for automated labels.
Conclusions
The proposed self-similarity feature, as a surrogate estimate of expressed respiratory HLG and computed from easily accessible effort signals, can detect periodic breathing regardless of admixed obstructive features such as flow limitation and can aid the prediction of REC.
Journal Article
Artificial intelligence in sleep medicine: background and implications for clinicians
by Kristo, David A.; Berry, Richard B.; Westover, M. Brandon
in Accuracy, Agreements, Algorithms
2020
Polysomnography remains the cornerstone of objective testing in sleep medicine and results in massive amounts of electrophysiological data, which is well-suited for analysis with artificial intelligence (AI)-based tools. Combined with other sources of health data, AI is expected to provide new insights to inform the clinical care of sleep disorders and advance our understanding of the integral role sleep plays in human health. Additionally, AI has the potential to streamline day-to-day operations and therefore optimize direct patient care by the sleep disorders team. However, clinicians, scientists, and other stakeholders must develop best practices to integrate this rapidly evolving technology into our daily work while maintaining the highest degree of quality and transparency in health care and research. Ultimately, when harnessed appropriately in conjunction with human expertise, AI will improve the practice of sleep medicine and further sleep science for the health and well-being of our patients.
Citation: Goldstein CA, Berry RB, Kent DT, et al. Artificial intelligence in sleep medicine: background and implications for clinicians. J Clin Sleep Med. 2020;16(4):609–618.
Journal Article
Refining sleep staging accuracy: transfer learning coupled with scorability models
by Westover, M Brandon; Thomas, Robert J; Ganglberger, Wolfgang
in Adult, Agreements, Automation
2024
Abstract
Study Objectives
This study aimed to (1) improve sleep staging accuracy through transfer learning (TL), to achieve or exceed human inter-expert agreement and (2) introduce a scorability model to assess the quality and trustworthiness of automated sleep staging.
Methods
A deep neural network (base model) was trained on a large multi-site polysomnography (PSG) dataset from the United States. TL was used to calibrate the model to a reduced montage and limited samples from the Korean Genome and Epidemiology Study (KoGES) dataset. Model performance was compared to inter-expert reliability among three human experts. A scorability assessment was developed to predict the agreement between the model and human experts.
Results
Initial sleep staging by the base model showed lower agreement with experts (κ = 0.55) compared to the inter-expert agreement (κ = 0.62). Calibration with 324 randomly sampled training cases matched expert agreement levels. Further targeted sampling improved performance, with models exceeding inter-expert agreement (κ = 0.70). The scorability assessment, combining biosignal quality and model confidence features, predicted model-expert agreement moderately well (R² = 0.42). Recordings with higher scorability scores demonstrated greater model-expert agreement than inter-expert agreement. Even with lower scorability scores, model performance was comparable to inter-expert agreement.
Conclusions
Fine-tuning a pretrained neural network through targeted TL significantly enhances sleep staging performance for an atypical montage, achieving and surpassing human expert agreement levels. The introduction of a scorability assessment provides a robust measure of reliability, ensuring quality control and enhancing the practical application of the system before deployment. This approach marks an important advancement in automated sleep analysis, demonstrating the potential for AI to exceed human performance in clinical settings.
Graphical Abstract
Journal Article
Sleep staging from electrocardiography and respiration with deep learning
by Tesh, Ryan A; Panneerselvam, Ezhil; Leone, Michael J
in Big Data Approaches to Sleep and Circadian Science, Cable television broadcasting industry, Deep Learning
2020
Abstract
Study Objectives
Sleep is reflected not only in the electroencephalogram but also in heart rhythms and breathing patterns. We hypothesized that it is possible to accurately stage sleep based on the electrocardiogram (ECG) and respiratory signals.
Methods
Using a dataset including 8682 polysomnograms, we develop deep neural networks to stage sleep from ECG and respiratory signals. Five deep neural networks consisting of convolutional networks and long short-term memory networks are trained to stage sleep using heart and breathing signals, including the timing of R peaks from ECG, abdominal and chest respiratory effort, and combinations of these signals.
Results
ECG in combination with the abdominal respiratory effort achieved the best performance, with a Cohen’s kappa of 0.585 (95% confidence interval ±0.017) for staging all five sleep stages and 0.760 (±0.019) for discriminating awake vs. rapid eye movement vs. non-rapid eye movement sleep. Performance is better for younger ages, whereas it is robust for body mass index, apnea severity, and commonly used outpatient medications.
Conclusions
Our results validate that ECG and respiratory effort provide substantial information about sleep stages in a large heterogeneous population. This opens new possibilities in sleep research and applications where electroencephalography is not readily available or may be infeasible.
Journal Article
Clinical Prediction Models for Sleep Apnea: The Importance of Medical History over Symptoms
by Ustun, Berk; Bianchi, Matt T.; Westover, M. Brandon
in Adult, Approximation, Decision Support Techniques
2016
Study Objective:
Obstructive sleep apnea (OSA) is a treatable contributor to morbidity and mortality. However, most patients with OSA remain undiagnosed. We used a new machine learning method known as SLIM (Supersparse Linear Integer Models) to test the hypothesis that a diagnostic screening tool based on routinely available medical information would be superior to one based solely on patient-reported sleep-related symptoms.
Methods:
We analyzed polysomnography (PSG) and self-reported clinical information from 1,922 patients tested in our clinical sleep laboratory. We used SLIM and 7 state-of-the-art classification methods to produce predictive models for OSA screening using features from: (i) self-reported symptoms; (ii) self-reported medical information that could, in principle, be extracted from electronic health records (demographics, comorbidities), or (iii) both.
Results:
For diagnosing OSA, we found that model performance using only medical history features was superior to model performance using symptoms alone, and similar to model performance using all features. Performance was similar to that reported for other widely used tools: sensitivity 64.2% and specificity 77%. SLIM accuracy was similar to state-of-the-art classification models applied to this dataset, but with the benefit of full transparency, allowing for hands-on prediction using yes/no answers to a small number of clinical queries.
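The "hands-on prediction using yes/no answers" mentioned above refers to an integer scoring system: each "yes" adds a small whole-number point value and the total is compared with a threshold. The sketch below only illustrates that general form. The items, point values, and threshold are hypothetical placeholders, not the published SLIM model, and must not be used for screening.

```python
# Hypothetical items, points, and threshold, chosen only to show the mechanics.
POINTS = {
    "age_60_or_older": 2,
    "hypertension": 2,
    "bmi_30_or_higher": 1,
    "female": -2,
}
THRESHOLD = 2  # hypothetical cut-off: totals at or above this screen positive


def total_score(answers):
    """Sum the integer points for every item answered 'yes' (True)."""
    unknown = set(answers) - set(POINTS)
    if unknown:
        raise KeyError(f"unknown items: {sorted(unknown)}")
    return sum(POINTS[item] for item, yes in answers.items() if yes)


def screens_positive(answers):
    """Apply the threshold rule to the summed score."""
    return total_score(answers) >= THRESHOLD


# Hypothetical patient answers.
patient = {
    "age_60_or_older": True,
    "hypertension": True,
    "bmi_30_or_higher": False,
    "female": True,
}
print(total_score(patient), screens_positive(patient))
```

Because the model is just a short table of integers, the score can be added up by hand, which is the transparency benefit the abstract describes.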
Conclusion:
For predicting OSA, variables such as age, sex, BMI, and medical history are superior to the symptom variables we examined. SLIM produces an actionable clinical tool that can be applied to data that is routinely available in modern electronic health records, which may facilitate automated, rather than manual, OSA screening.
Commentary:
A commentary on this article appears in this issue on page 159.
Citation: Ustun B, Westover MB, Rudin C, Bianchi MT. Clinical prediction models for sleep apnea: the importance of medical history over symptoms. J Clin Sleep Med. 2016;12(2):161–168.
Journal Article
Sample Size Analysis for Machine Learning Clinical Validation Studies
by Westover, M. Brandon; Goldenholz, Daniel M.; Ganglberger, Wolfgang
in Accuracy, Algorithms, Bias
2023
Background: Before new machine learning (ML) algorithms are integrated into clinical practice, they must undergo validation. Validation studies require sample size estimates. Unlike hypothesis-testing studies, which seek a p-value, studies validating predictive models aim to obtain estimates of model performance. There is no standard tool for determining sample size estimates for clinical validation studies of machine learning models.
Methods: Our open-source method, Sample Size Analysis for Machine Learning (SSAML), was described and tested on three previously published models: brain age to predict mortality (Cox Proportional Hazard), COVID hospitalization risk prediction (ordinal regression), and seizure risk forecasting (deep learning).
Results: Minimum sample sizes were obtained in each dataset using standardized criteria.
Discussion: SSAML provides a formal expectation of precision and accuracy at a desired confidence level. SSAML is open-source and agnostic to data type and ML model. It can be used for clinical validation studies of ML models.
Journal Article
Linking brain structure, cognition, and sleep: insights from clinical data
2024
Abstract
Study Objectives
To use relatively noisy, routinely collected clinical data (brain magnetic resonance imaging (MRI) data, clinical polysomnography (PSG) recordings, and neuropsychological testing) to investigate hypothesis-driven and data-driven relationships between brain physiology, structure, and cognition.
Methods
We analyzed data from patients with clinical PSG, brain MRI, and neuropsychological evaluations. SynthSeg, a neural network-based tool, provided high-quality segmentations despite noise. A priori hypotheses explored associations between brain function (measured by PSG) and brain structure (measured by MRI). Associations with cognitive scores and dementia status were studied. An exploratory data-driven approach investigated age-structure-physiology-cognition links.
Results
Six hundred and twenty-three patients with sleep PSG and brain MRI data were included in this study; 160 with cognitive evaluations. Three hundred and forty-two participants (55%) were female, and age interquartile range was 52 to 69 years. Thirty-six individuals were diagnosed with dementia, 71 with mild cognitive impairment, and 326 with major depression. One hundred and fifteen individuals were evaluated for insomnia and 138 participants had an apnea–hypopnea index equal to or greater than 15. Total PSG delta power correlated positively with frontal lobe/thalamic volumes, and sleep spindle density with thalamic volume. Rapid eye movement (REM) duration and amygdala volume were positively associated with cognition. Patients with dementia showed significant differences in five brain structure volumes. REM duration, spindle, and slow-oscillation features had strong associations with cognition and brain structure volumes. PSG and MRI features in combination predicted chronological age (R² = 0.67) and cognition (R² = 0.40).
Conclusions
Routine clinical data holds extended value in understanding and even clinically using brain-sleep-cognition relationships.
Graphical Abstract
Journal Article
Association of Red Blood Cell Distribution Width With Mortality Risk in Hospitalized Adults With SARS-CoV-2 Infection
2020
Importance
Coronavirus disease 2019 (COVID-19) is an acute respiratory illness with a high rate of hospitalization and mortality. Biomarkers are urgently needed for patient risk stratification. Red blood cell distribution width (RDW), a component of complete blood counts that reflects cellular volume variation, has been shown to be associated with elevated risk for morbidity and mortality in a wide range of diseases.
Objective
To investigate whether an association between mortality risk and elevated RDW at hospital admission and during hospitalization exists in patients with COVID-19.
Design, Setting, and Participants
This cohort study included adults diagnosed with SARS-CoV-2 infection and admitted to 1 of 4 hospitals in the Boston, Massachusetts area (Massachusetts General Hospital, Brigham and Women's Hospital, North Shore Medical Center, and Newton-Wellesley Hospital) between March 4, 2020, and April 28, 2020.
Main Outcomes and Measures
The main outcome was patient survival during hospitalization. Measures included RDW at admission and during hospitalization, with an elevated RDW defined as greater than 14.5%. Relative risk (RR) of mortality was estimated by dividing the mortality of those with an elevated RDW by the mortality of those without an elevated RDW. Mortality hazard ratios (HRs) and 95% CIs were estimated using a Cox proportional hazards model.
Results
A total of 1641 patients were included in the study (mean [SD] age, 62 [18] years; 886 men [54%]; 740 White individuals [45%] and 497 Hispanic individuals [30%]; 276 nonsurvivors [17%]). Elevated RDW (>14.5%) was associated with an increased mortality risk in patients of all ages. The RR for the entire cohort was 2.73, with a mortality rate of 11% in patients with normal RDW (n = 1173) and 31% in those with an elevated RDW (n = 468). The RR in patients younger than 50 years was 5.25 (normal RDW, 1% [n = 341]; elevated RDW, 8% [n = 65]); 2.90 in the 50- to 59-year age group (normal RDW, 8% [n = 256]; elevated RDW, 24% [n = 63]); 3.96 in the 60- to 69-year age group (normal RDW, 8% [n = 226]; elevated RDW, 30% [n = 104]); 1.45 in the 70- to 79-year age group (normal RDW, 23% [n = 182]; elevated RDW, 33% [n = 113]); and 1.59 in those ≥80 years (normal RDW, 29% [n = 168]; elevated RDW, 46% [n = 123]). RDW was associated with mortality risk in Cox proportional hazards models adjusted for age, D-dimer (dimerized plasmin fragment D) level, absolute lymphocyte count, and common comorbidities such as diabetes and hypertension (hazard ratio of 1.09 per 0.5% RDW increase and 2.01 for an RDW >14.5% vs ≤14.5%; P < .001). Patients whose RDW increased during hospitalization had higher mortality compared with those whose RDW did not change; for those with normal RDW, mortality increased from 6% to 24%, and for those with an elevated RDW at admission, mortality increased from 22% to 40%.
Conclusions and Relevance
Elevated RDW at the time of hospital admission and an increase in RDW during hospitalization were associated with increased mortality risk for patients with COVID-19 who received treatment at 4 hospitals in a large academic medical center network.
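The relative-risk calculation described in this abstract (mortality in the elevated-RDW group divided by mortality in the normal-RDW group) can be sketched as below. This is an illustrative helper, not the study's analysis code; the function name and the example counts are hypothetical, since the abstract reports only rounded percentages.

```python
def relative_risk(deaths_exposed, n_exposed, deaths_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    if n_exposed <= 0 or n_unexposed <= 0:
        raise ValueError("group sizes must be positive")
    risk_exposed = deaths_exposed / n_exposed
    risk_unexposed = deaths_unexposed / n_unexposed
    if risk_unexposed == 0:
        raise ZeroDivisionError("relative risk is undefined when the unexposed risk is zero")
    return risk_exposed / risk_unexposed


# Hypothetical counts: 30 deaths among 100 exposed, 10 among 100 unexposed.
rr = relative_risk(30, 100, 10, 100)
print(round(rr, 2))
```

Applying the same ratio to the rounded whole-cohort rates quoted above (31% and 11%) gives roughly 2.8, which is consistent with the reported RR of 2.73 once rounding of the percentages is taken into account.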
Journal Article
Optimal spindle detection parameters for predicting cognitive performance
2022
Abstract
Study Objectives
Alterations in sleep spindles have been linked to cognitive impairment. This finding has contributed to a growing interest in identifying sleep-based biomarkers of cognition and neurodegeneration, including sleep spindles. However, flexibility surrounding spindle definitions and algorithm parameter settings presents a methodological challenge. The aim of this study was to characterize how spindle detection parameter settings influence the association between spindle features and cognition and to identify parameters with the strongest association with cognition.
Methods
Adult patients (n = 167, 49 ± 18 years) completed the NIH Toolbox Cognition Battery after undergoing overnight diagnostic polysomnography recordings for suspected sleep disorders. We explored 1000 combinations across seven parameters in Luna, an open-source spindle detector, and used four features of detected spindles (amplitude, density, duration, and peak frequency) to fit linear multiple regression models to predict cognitive scores.
Results
Spindle features (amplitude, density, duration, and mean frequency) were associated with the ability to predict raw fluid cognition scores (r = 0.503) and age-adjusted fluid cognition scores (r = 0.315) with the best spindle parameters. Fast spindle features generally showed better performance relative to slow spindle features. Spindle features weakly predicted total cognition and poorly predicted crystallized cognition regardless of parameter settings.
Conclusions
Our exploration of spindle detection parameters identified optimal parameters for studies of fluid cognition and revealed the role of parameter interactions for both slow and fast spindles. Our findings support sleep spindles as a sleep-based biomarker of fluid cognition.
Journal Article