Catalogue Search | MBRL

Verification, analytical validation, and clinical validation (V3): the foundation of determining fit-for-purpose for Biometric Monitoring Technologies (BioMeTs)

by Wood, William A. , Gujar, Ninad , Bakker, Jessie P. in 706/648 , 706/648/697 , Biomedicine

2020

Digital medicine is an interdisciplinary field, drawing together stakeholders with expertise in engineering, manufacturing, clinical science, data science, biostatistics, regulatory science, ethics, patient advocacy, and healthcare policy, to name a few. Although this diversity is undoubtedly valuable, it can lead to confusion regarding terminology and best practices. There are many instances, as we detail in this paper, where a single term is used by different groups to mean different things, as well as cases where multiple terms are used to describe essentially the same concept. Our intent is to clarify core terminology and best practices for the evaluation of Biometric Monitoring Technologies (BioMeTs), without unnecessarily introducing new terms. We focus on the evaluation of BioMeTs as fit-for-purpose for use in clinical trials. However, our intent is for this framework to be instructional to all users of digital measurement tools, regardless of setting or intended use. We propose and describe a three-component framework intended to provide a foundational evaluation framework for BioMeTs. This framework includes (1) verification, (2) analytical validation, and (3) clinical validation. We aim for this common vocabulary to enable more effective communication and collaboration, generate a common and meaningful evidence base for BioMeTs, and improve the accessibility of the digital medicine field.

Journal Article

Defining the Digital Measurement of Scratching During Sleep or Nocturnal Scratching: Review of the Literature

by Ke Wang, Will , Cesnakova, Lucia , Dunn, Jessilyn in Atopic , Atopic dermatitis , Dermatitis

2023

Digital sensing solutions represent a convenient, objective, relatively inexpensive method that could be leveraged for assessing symptoms of various health conditions. Recent progress in the capabilities of digital sensing products has targeted the measurement of scratching during sleep, traditionally referred to as nocturnal scratching, in patients with atopic dermatitis or other skin conditions. Many solutions measuring nocturnal scratch have been developed; however, a lack of efforts toward standardization of the measure's definition and contextualization of scratching during sleep hampers the ability to compare different technologies for this purpose. We aimed to address this gap and bring forth unified measurement definitions for nocturnal scratch. We performed a narrative literature review of definitions of scratching in patients with skin inflammation and a targeted literature review of sleep in the context of the period during which such scratching occurred. Both searches were limited to English language studies in humans. The extracted data were synthesized into themes based on study characteristics: scratch as a behavior, other characterization of the scratching movement, and measurement parameters for both scratch and sleep. We then developed ontologies for the digital measurement of sleep scratching. In all, 29 studies defined inflammation-related scratching between 1996 and 2021. When cross-referenced with the results of search terms describing the sleep period, only 2 of these scratch-related papers also described sleep-related variables. From these search results, we developed an evidence-based and patient-centric definition of nocturnal scratch: an action of rhythmic and repetitive skin contact movement performed during a delimited time period of intended and actual sleep that is not restricted to any specific time of the day or night. 
Based on the measurement properties identified in the searches, we developed ontologies of relevant concepts that can be used as a starting point to develop standardized outcome measures of scratching during sleep in patients with inflammatory skin conditions. This work is intended to serve as a foundation for the future development of unified and well-described digital health technologies measuring nocturnal scratching and should enable better communication and sharing of results between various stakeholders taking part in research in atopic dermatitis and other inflammatory skin conditions.

Journal Article

EventDTW: An Improved Dynamic Time Warping Algorithm for Aligning Biomedical Signals of Nonuniform Sampling Frequencies

by Bent, Brinnae , Olgin, Jeffrey , Avram, Robert in Accuracy , Algorithms , Biomedical Technology

2020

The dynamic time warping (DTW) algorithm is widely used in pattern matching and sequence alignment tasks, including speech recognition and time series clustering. However, DTW algorithms perform poorly when aligning sequences of uneven sampling frequencies. This makes it difficult to apply DTW to practical problems, such as aligning signals that are recorded simultaneously by sensors with different, uneven, and dynamic sampling frequencies. As multi-modal sensing technologies become increasingly popular, it is necessary to develop methods for high quality alignment of such signals. Here we propose a DTW algorithm called EventDTW which uses information propagated from defined events as basis for path matching and hence sequence alignment. We have developed two metrics, the error rate (ER) and the singularity score (SS), to define and evaluate alignment quality and to enable comparison of performance across DTW algorithms. We demonstrate the utility of these metrics on 84 publicly-available signals in addition to our own multi-modal biomedical signals. EventDTW outperformed existing DTW algorithms for optimal alignment of signals with different sampling frequencies in 37% of artificial signal alignment tasks and 76% of real-world signal alignment tasks.
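For orientation, the classic DTW baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of standard dynamic time warping with an absolute-difference local cost, not the EventDTW method itself, whose details the abstract does not give; the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW alignment cost between two numeric sequences.

    Assumption: local cost is the absolute difference between samples.
    This is the textbook algorithm, not EventDTW.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because one sample may be matched to several samples in the other sequence, a signal and a resampled copy of it can align at zero cost, which is the property that makes DTW attractive for signals recorded at different rates.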

Journal Article

A Systematic Review of Time Series Classification Techniques Used in Biomedical Applications

by Kotla, Aditya , Shandhi, Md Mobashir Hasan , Yerrabelli, Rushil in Algorithms , Biomedical engineering , Classification

2022

Background: Digital clinical measures collected via various digital sensing technologies such as smartphones, smartwatches, wearables, and ingestible and implantable sensors are increasingly used by individuals and clinicians to capture the health outcomes or behavioral and physiological characteristics of individuals. Time series classification (TSC) is very commonly used for modeling digital clinical measures. While deep learning models for TSC are very common and powerful, there exist some fundamental challenges. This review presents the non-deep learning models that are commonly used for time series classification in biomedical applications that can achieve high performance. Objective: We performed a systematic review to characterize the techniques that are used in time series classification of digital clinical measures throughout all the stages of data processing and model building. Methods: We conducted a literature search on PubMed, as well as the Institute of Electrical and Electronics Engineers (IEEE), Web of Science, and SCOPUS databases using a range of search terms to retrieve peer-reviewed articles that report on the academic research about digital clinical measures from a five-year period between June 2016 and June 2021. We identified and categorized the research studies based on the types of classification algorithms and sensor input types. Results: We found 452 papers in total from four different databases: PubMed, IEEE, Web of Science Database, and SCOPUS. After removing duplicates and irrelevant papers, 135 articles remained for detailed review and data extraction. Among these, engineered features using time series methods that were subsequently fed into widely used machine learning classifiers were the most commonly used technique, and also most frequently achieved the best performance metrics (77 out of 135 articles). 
Statistical modeling algorithms (24 out of 135 articles) were the second most common and also the second-best classification technique. Conclusions: This review summarizes and categorizes time series classification models and interpretation methods for biomedical applications. While high time series classification performance has been achieved in digital clinical, physiological, or biomedical measures, no standard benchmark datasets, modeling methods, or reporting methodology exist. There is no single widely used method for time series model development or feature interpretation; however, many different methods have proven successful.

Journal Article

Recent Academic Research on Clinically Relevant Digital Measures: Systematic Review

by Shandhi, Md Mobashir Hasan , Chung, Jeanne , Wang, Will Ke in Adoption of innovations , Anatomical systems , Behavioral economics

2021

Digital clinical measures collected via various digital sensing technologies such as smartphones, smartwatches, wearables, ingestibles, and implantables are increasingly used by individuals and clinicians to capture health outcomes or behavioral and physiological characteristics of individuals. Although academia is taking an active role in evaluating digital sensing products, academic contributions to advancing the safe, effective, ethical, and equitable use of digital clinical measures are poorly characterized. We performed a systematic review to characterize the nature of academic research on digital clinical measures and to compare and contrast the types of sensors used and the sources of funding support for specific subareas of this research. We conducted a PubMed search using a range of search terms to retrieve peer-reviewed articles reporting US-led academic research on digital clinical measures between January 2019 and February 2021. We screened each publication against specific inclusion and exclusion criteria. We then identified and categorized research studies based on the types of academic research, sensors used, and funding sources. Finally, we compared and contrasted the funding support for these specific subareas of research and sensor types. The search retrieved 4240 articles of interest. Following the screening, 295 articles remained for data extraction and categorization. The top five research subareas included operations research (research analysis; n=225, 76%), analytical validation (n=173, 59%), usability and utility (data visualization; n=123, 42%), verification (n=93, 32%), and clinical validation (n=83, 28%). The three most underrepresented areas of research into digital clinical measures were ethics (n=0, 0%), security (n=1, 0.5%), and data rights and governance (n=1, 0.5%). Movement and activity trackers were the most commonly studied sensor type, and physiological (mechanical) sensors were the least frequently studied. 
We found that government agencies are providing the most funding for research on digital clinical measures (n=192, 65%), followed by independent foundations (n=109, 37%) and industries (n=56, 19%), with the remaining 12% (n=36) of these studies completely unfunded. Specific subareas of academic research related to digital clinical measures are not keeping pace with the rapid expansion and adoption of digital sensing products. An integrated and coordinated effort is required across academia, academic partners, and academic funders to establish the field of digital clinical measures as an evidence-based field worthy of our trust.

Journal Article

Performance of Wearable Pulse Oximetry During Controlled Hypoxia Induction: Instrument Validation Study

by Roghanizad, Ali R , Bhosai, Satasuk Joy , MacLeod, David in Adult , Female , Formative Evaluation of Digital Health Interventions

2026

Oxygen saturation is a crucial metric used for monitoring patients with lung disease or respiratory illness who are at risk of hypoxemia (low blood oxygen saturation). Early and accurate identification of abnormal oxygen saturation is important for these patients, who may develop significant desaturation and hypoxemia symptoms during their daily activities. This study aimed to evaluate the accuracy of Apple Watch Series 7 and a clinical-grade pulse oximeter, Masimo MightySat Rx, under hypoxemia and to assess whether measurement error is influenced by the oxygen desaturation rate (ODR). We calculated the ODR of each measurement and conducted a comparative analysis of the displayed oxygen saturation readings from both the Masimo MightySat Rx finger pulse oximeter and Apple Watch Series 7 with arterial blood oxygen saturation (SaO2) readings obtained from a blood gas analyzer. Both the Masimo MightySat Rx pulse oximeter and the Apple Watch Series 7 tended to overestimate oxygen saturation. The pulse oximeter readings were more likely to fall within 2% of the acceptable (as specified by Masimo) peripheral oxygen saturation (SpO₂) error range than the Apple Watch (49.03% vs 32.14%). Notably, both devices had limitations under low oxygen saturation levels (<88%), with an accuracy root mean square difference (Arms) of 3.52% (95% CI 3.18%-3.86%) and 5.82% (95% CI 5.32%-6.31%) for the Masimo MightySat Rx and Apple Watch Series 7, respectively. Among the blood oxygen measurements taken during a high ODR (ie, ≥2% SpO2 per minute), which is a rate clinically correlated with sleep apnea, the Arms increased slightly by 0.75% for the Masimo MightySat Rx and decreased by 0.28% for the Apple Watch Series 7. Both devices consistently overestimated SpO2, with accuracy declining notably during hypoxemia. The Apple Watch Series 7 mean bias suggests a likelihood of missing instances of hypoxemia, particularly at arterial oxygen saturation values below but close to 88%. 
Both the Apple Watch Series 7 and Masimo MightySat Rx exhibited Arms values exceeding the US Food and Drug Administration threshold under conditions of hypoxemia. While past studies have implicated high ODRs in increasing measurement error, we found no statistically significant relationship between ODR and measurement error for either device. Overall, our findings of SpO2 overestimation and high Arms values underscore the need for caution when interpreting oxygen saturation values from these devices. The small sample size and limited diversity in skin tone and age restrict the generalizability of our findings. Future studies should include larger and more diverse populations to evaluate the performance of wearable-based pulse oximetry.
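As a reading aid, the two accuracy statistics named in the abstract can be sketched as below. This assumes the conventional definitions: Arms is the root mean square of paired differences between device SpO2 and reference SaO2, and mean bias is their average signed difference. The abstract does not show the study's exact computation, and the function names and sample values are hypothetical.

```python
import math

def arms(spo2, sao2):
    """Accuracy root mean square difference between paired readings (in % points)."""
    if len(spo2) != len(sao2) or not spo2:
        raise ValueError("need equal-length, non-empty paired readings")
    squared = [(p - r) ** 2 for p, r in zip(spo2, sao2)]
    return math.sqrt(sum(squared) / len(squared))

def mean_bias(spo2, sao2):
    """Average signed error; a positive value means the device overestimates."""
    if len(spo2) != len(sao2) or not spo2:
        raise ValueError("need equal-length, non-empty paired readings")
    return sum(p - r for p, r in zip(spo2, sao2)) / len(spo2)

# Illustrative values only, not data from the study.
device = [90.0, 92.0]
reference = [87.0, 88.0]
print(arms(device, reference), mean_bias(device, reference))
```

Arms penalises large individual errors regardless of sign, while mean bias shows the direction of the error, which is why the abstract reports both an Arms value and a tendency to overestimate.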

Journal Article

Impact of daily caffeine intake and timing on electroencephalogram-measured sleep in adolescents

by Kollins, Scott H. , Wang, Ke Will , Engelhard, Matthew M. in Adolescent , Caffeine , Caffeine - adverse effects

2022

Study Objectives: Caffeine use is ubiquitous among adolescents and may be harmful to sleep, with downstream implications for health and development. Research has been limited by self-reported and/or aggregated measures of sleep and caffeine collected at a single time point. This study examines bidirectional associations between daily caffeine consumption and electroencephalogram-measured sleep among adolescents and explores whether these relationships depend on timing of caffeine use. Methods: Ninety-eight adolescents aged 11–17 (mean = 14.38, standard deviation = 1.77; 50% female) participated in 7 consecutive nights of at-home sleep electroencephalography and completed a daily diary querying morning, afternoon, and evening caffeine use. Linear mixed-effects regressions examined relationships between caffeine consumption and total sleep time, sleep-onset latency, sleep efficiency, wake after sleep onset, and time spent in sleep stages. Impact of sleep indices on next-day caffeine use was also examined. Results: Increased total caffeine consumption was associated with increased sleep-onset latency (β = .13; 95% confidence interval [CI] = .06, .21; P < .001) and reduced total sleep time (β = −.17; 95% CI = −.31, −.02; P = .02), sleep efficiency (β = −1.59; 95% CI = −2.51, −.67; P < .001), and rapid eye movement sleep (β = −.12; 95% CI = −.19, −.05; P < .001). Findings were driven by afternoon and evening caffeine consumption. Reduced sleep efficiency was associated with increased afternoon caffeine intake the following day (β = −.006; 95% CI = −.012, −.001; P = .01). Conclusions: Caffeine consumption, especially afternoon and evening use, impacts several aspects of adolescent sleep health. In contrast, most sleep indicators did not affect next-day caffeine use, suggesting multiple drivers of adolescent caffeine consumption. 
Federal mandates requiring caffeine content labeling and behavioral interventions focused on reducing caffeine intake may support adolescent sleep health. Citation: Lunsford-Avery JR, Kollins SH, Kansagra S, Wang KW, Engelhard MM. Impact of daily caffeine intake and timing on electroencephalogram-measured sleep in adolescents. J Clin Sleep Med. 2022;18(3):877–884.

Journal Article

Sleep Health and Wearable Technology: Algorithmic Development Towards Field-Based Sleep Monitoring

by Wang, Will Ke in Artificial intelligence , Bioinformatics , Biomedical engineering

2024

Wearable devices have rapidly become essential tools for tracking sleep in natural, nonclinical settings. Despite their widespread adoption, consumer-grade wearables, such as smartwatches and fitness trackers, exhibit significant limitations in their ability to accurately track wake epochs after sleep onset and classify sleep stages, particularly in individuals with sleep disorders. Recent independent validation studies reported frequent misclassifications, especially distinguishing Rapid Eye Movement (REM) sleep and non-REM (NREM) N3 from other sleep stages. This inaccuracy is exacerbated in populations with sleep disorders such as obstructive sleep apnea (OSA), where sleep is fragmented and physiological signals and sleep patterns deviate from those seen in healthy individuals. Furthermore, it is practically impossible for independent academic researchers to develop and evaluate sleep staging algorithms from consumer devices due to their proprietary nature and a lack of publicly available datasets for wearable sleep staging. The need for precise, reliable, and scalable sleep tracking methods in wearable devices is crucial as wearables become more integrated into both personal health management and clinical applications. To address these challenges, I collected and published the Dataset for Real-time sleep stage EstimAtion using Multisensor wearable Technology (DREAMT), a unique collection of multimodal physiological data recorded from 100 participants diagnosed with varying sleep disorders at the Duke Sleep Disorders Center. The DREAMT dataset includes synchronized recordings of both wearable device data, obtained using Empatica E4 smartwatches, and sleep stage annotations and sleep apnea events annotated by certified sleep technicians based on clinical polysomnography (PSG), the gold standard for clinical sleep studies and sleep staging. This dataset is the first and only high-resolution wearable smartwatch dataset with reliable sleep stages made public. 
It is an indispensable resource for advancing the development and validation of sleep staging algorithms capable of accurately detecting sleep stages using wearable smartwatches by serving as a benchmark for the research community to develop and compare new algorithmic development. It represents an important step towards establishing an open science framework for wearable-based sleep research. Leveraging this dataset, I proposed a modeling approach to predict sleep vs. wake, combining feature engineering, Light Gradient Boosting Machines (LightGBM) with Gaussian Process-based mixed effects modeling (GPBoost) for epoch-by-epoch sleep vs. wake prediction, and a Long Short-Term Memory (LSTM) network for post-processing. The LSTM module is an innovative approach to improve sleep vs. wake detection by capturing the temporal dependencies within these physiological signals. This feature engineering process significantly enhanced the model’s ability to detect transitions between wakefulness and sleep, especially in cases of individuals with sleep disorders by recognizing that individuals of varying degrees of sleep disorder severity are very likely to exhibit different sleep patterns and behaviors. This ensemble model established a baseline and also provided a foundation for exploring more sophisticated deep learning architectures tailored to wearable sleep data. Building on this foundation and to utilize the existing large external PSG datasets, I designed WatchSleepNet to predict wake vs. NREM vs. REM, a deep learning model specifically developed to tackle the inherent challenges of wearable-based sleep staging. WatchSleepNet integrates Convolutional Neural Networks (CNNs), Temporal Convolutional Networks (TCNs), and bidirectional LSTM networks with multi-head attention mechanisms to process Inter-beat Interval (IBI) signals for sequence-to-sequence classification. 
The model was trained to recognize both spatial and temporal dependencies in the physiological data, enabling it to accurately classify sleep stages in terms of wake, NREM, and REM. One of the unique strengths of WatchSleepNet is the approach of pretraining using IBI values calculated from both ECG and PPG signals available in large external PSG datasets, including the Sleep Heart Health Study (SHHS) and the Multi-Ethnic Study of Atherosclerosis (MESA). This pretraining step allowed the model to learn foundational patterns in sleep physiology across diverse populations and levels of sleep disorder severity. Following pretraining, the model was fine-tuned using the DREAMT dataset, ensuring that it could adapt to the unique characteristics of wrist-based PPG data collected in real-world settings. WatchSleepNet demonstrated superior performance compared to state-of-the-art models like SleepConvNet and InsightSleepNet in a similar training pipeline, achieving significant improvements in REM sleep classification, an area where consumer-grade wearables typically perform poorly. The model achieved a REM F1-score of 0.648 and a Cohen’s Kappa of 0.706, significantly higher than the results from the benchmark algorithms, highlighting its potential to bridge the gap between consumer and gold-standard in-clinic sleep tracking. Beyond model development, this dissertation contributes to the broader field of digital health, particularly in promoting open science and the standardization of wearable sensor data for sleep research. By publishing the DREAMT dataset and developing reproducible methodologies for digital sleep biomarkers, this work sets the stage for more transparent, collaborative research in wearable-based sleep tracking. Additionally, I explored the clinical utility of wearable sleep monitoring in detecting circadian rhythm disruptions and their impacts on mental health in adolescents. 
Circadian misalignments, common among shift workers and individuals with sleep disorders, are linked to an increased risk of mood disorders, anxiety, and metabolic dysfunction. By advancing the accuracy and reliability of wearable devices in tracking these disruptions, this research opens new avenues for early detection, intervention, and management of these conditions.

Dissertation

Tree-based classification model for Long-COVID infection prediction with age stratification using data from the National COVID Cohort Collaborative

by Singh, Karnika , Cho, Peter , Roghanizad, Ali R in Classification , Collaboration , Decision-making

2024

Objectives: We propose and validate a domain knowledge-driven classification model for diagnosing post-acute sequelae of SARS-CoV-2 infection (PASC), also known as Long COVID, using Electronic Health Records (EHRs) data. Materials and Methods: We developed a robust model that incorporates features strongly indicative of PASC or associated with the severity of COVID-19 symptoms as identified in our literature review. The XGBoost tree-based architecture was chosen for its ability to handle class-imbalanced data and its potential for high interpretability. Using the training data provided by the Long COVID Computation Challenge (L3C), which was a sample of the National COVID Cohort Collaborative (N3C), our models were fine-tuned and calibrated to optimize Area Under the Receiver Operating Characteristic curve (AUROC) and the F1 score, following best practices for the class-imbalanced N3C data. Results: Our age-stratified classification model demonstrated strong performance with an average 5-fold cross-validated AUROC of 0.844 and F1 score of 0.539 across the young adult, mid-aged, and older-aged populations in the training data. In an independent testing dataset, which was made available after the challenge was over, we achieved an overall AUROC score of 0.814 and F1 score of 0.545. Discussion: The results demonstrated the utility of knowledge-driven feature engineering in sparse EHR data and demographic stratification in model development to diagnose a complex and heterogeneously presenting condition like PASC. The model’s architecture, mirroring natural clinician decision-making processes, contributed to its robustness and interpretability, which are crucial for clinical translatability. Further, the model’s generalizability was evaluated on new cross-sectional data provided in the later stages of the L3C challenge. Conclusion: The study proposed and validated the effectiveness of age-stratified, tree-based classification models to diagnose PASC. 
Our approach highlights the potential of machine learning in addressing the diagnostic challenges posed by the heterogeneity of Long-COVID symptoms. Lay Summary: Post-acute sequelae of SARS-CoV-2 infection (PASC), also called Long COVID, refers to a range of symptoms that continue for weeks or months after recovering from the initial COVID-19 infection. While some people recover fully, others experience persistent issues like fatigue, difficulty breathing, coughing, and memory impairment, which can severely affect their daily lives. In this study, we developed a machine learning model to help health care providers diagnose Long COVID more effectively using retrospective electronic health records (EHRs). The model is designed to be interpretable, providing insights into which features are important and how the model reaches its conclusions. Importantly, the model is designed to account for the differences in how PASC manifests in various age groups, ensuring reliable diagnosis and care for patients across all age groups.

Journal Article

Verification, analytical validation, and clinical validation (V3): the foundation of determining fit-for-purpose for Biometric Monitoring Technologies (BioMeTs)

by Gujar, Ninad , Peterson, Barry , Bakker, Jessie P

2020

Journal Article
