972 result(s) for "Data harmonization"
The quest for seafloor macrolitter: a critical review of background knowledge, current methods and future prospects
The seafloor covers some 70% of the Earth’s surface and has been recognised as a major sink for marine litter. Still, litter on the seafloor is the least investigated fraction of marine litter, which is not surprising as most of it lies in the deep sea, i.e. the least explored ecosystem. Although marine litter is considered a major threat for the oceans, monitoring frameworks are still being set up. This paper reviews current knowledge and methods, identifies existing needs, and points to future developments that are required to address the estimation of seafloor macrolitter. It provides background knowledge and conveys the views and thoughts of scientific experts on seafloor marine litter offering a review of monitoring and ocean modelling techniques. Knowledge gaps that need to be tackled, data needs for modelling, and data comparability and harmonisation are also discussed. In addition, it shows how research on seafloor macrolitter can inform international protection and conservation frameworks to prioritise efforts and measures against marine litter and its deleterious impacts.
The application of data harmonization in minority studies. The case of the political participation of the Russian speaking population in former soviet states
This article discusses the theoretical possibilities and practical implications of survey data recycling and survey data harmonization. Using the example of political participation (participation in demonstration) of the Russian-speaking population in former Soviet states, the article presents the procedure of key variable harmonization (minority status), the rules, and the procedures of creating a harmonization control variable, and the possibilities of using harmonized variables in substantive statistical analysis. The harmonization procedures described in this article can be used to study other rare events and other minority groups – studies that often struggle with small and insufficient samples.
How European Research Projects Can Support Vaccination Strategies: The Case of the ORCHESTRA Project for SARS-CoV-2
ORCHESTRA (“Connecting European Cohorts to Increase Common and Effective Response To SARS-CoV-2 Pandemic”) is an EU-funded project which aims to help rapidly advance the knowledge related to the prevention of the SARS-CoV-2 infection and the management of COVID-19 and its long-term sequelae. Here, we describe the early results of this project, focusing on the strengths of multiple, international, historical and prospective cohort studies and highlighting those results which are of potential relevance for vaccination strategies, such as the necessity of a vaccine booster dose after a primary vaccination course in hematologic cancer patients and in solid organ transplant recipients to elicit a higher antibody titer, and the protective effect of vaccination on severe COVID-19 clinical manifestation and on the emergence of post-COVID-19 conditions. Valuable data regarding epidemiological variations, risk factors of SARS-CoV-2 infection and its sequelae, and vaccination efficacy in different subpopulations can support further defining public health vaccination policies.
Breaking Digital Health Barriers Through a Large Language Model–Based Tool for Automated Observational Medical Outcomes Partnership Mapping: Development and Validation Study
The integration of diverse clinical data sources requires standardization through models such as Observational Medical Outcomes Partnership (OMOP). However, mapping data elements to OMOP concepts demands significant technical expertise and time. While large health care systems often have resources for OMOP conversion, smaller clinical trials and studies frequently lack such support, leaving valuable research data siloed. This study aims to develop and validate a user-friendly tool that leverages large language models to automate the OMOP conversion process for clinical trials, electronic health records, and registry data. We developed a 3-tiered semantic matching system using GPT-3 embeddings to transform heterogeneous clinical data to the OMOP Common Data Model. The system processes input terms by generating vector embeddings, computing cosine similarity against precomputed Observational Health Data Sciences and Informatics vocabulary embeddings, and ranking potential matches. We validated the system using two independent datasets: (1) a development set of 76 National Institutes of Health Helping to End Addiction Long-term Initiative clinical trial common data elements for chronic pain and opioid use disorders and (2) a separate validation set of electronic health record concepts from the National Institutes of Health National COVID Cohort Collaborative COVID-19 enclave. The architecture combines Unified Medical Language System semantic frameworks with asynchronous processing for efficient concept mapping, made available through an open-source implementation. The system achieved an area under the receiver operating characteristic curve of 0.9975 for mapping clinical trial common data element terms. Precision ranged from 0.92 to 0.99 and recall ranged from 0.88 to 0.97 across similarity thresholds from 0.85 to 1.0. 
In practical application, the tool successfully automated mappings that previously required manual informatics expertise, reducing the technical barriers for research teams to participate in large-scale, data-sharing initiatives. Representative mappings demonstrated high accuracy, such as demographic terms achieving 100% similarity with corresponding Logical Observation Identifiers Names and Codes concepts. The implementation successfully processes diverse data types through both individual term mapping and batch processing capabilities. Our validated large language model-based tool effectively automates the transformation of clinical data into the OMOP format while maintaining high accuracy. The combination of semantic matching capabilities and a researcher-friendly interface makes data harmonization accessible to smaller research teams without requiring extensive informatics support. This has direct implications for accelerating clinical research data standardization and enabling broader participation in initiatives such as the National Institutes of Health Helping to End Addiction Long-term Initiative Data Ecosystem.
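The matching step described in this abstract (embed a term, compute cosine similarity against precomputed vocabulary embeddings, rank the candidates) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors, the concept labels, and the function names `cosine_similarity` and `rank_matches` are hypothetical stand-ins, and a real system would use model-generated embeddings with far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_matches(term_vec, vocab, threshold=0.85):
    """Return (label, similarity) pairs at or above threshold, best first."""
    scored = [(label, cosine_similarity(term_vec, vec)) for label, vec in vocab.items()]
    kept = [(label, s) for label, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy "embeddings"; real ones would come from an embedding model.
toy_vocab = {
    "Gender": [1.0, 0.0, 0.0],
    "Body weight": [0.0, 1.0, 0.0],
    "Pain severity": [0.6, 0.0, 0.8],
}

# A low threshold is used here only so the toy data yields more than one match.
matches = rank_matches([2.0, 0.0, 0.0], toy_vocab, threshold=0.5)
for label, score in matches:
    print(f"{label}: {score:.2f}")
```

Cosine similarity ignores vector length, so the query `[2.0, 0.0, 0.0]` matches "Gender" perfectly even though the magnitudes differ. The abstract reports thresholds between 0.85 and 1.0; the default above reflects the lower end of that range.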
Estimating prevalence of subjective cognitive decline in and across international cohort studies of aging: a COSMIC study
Background: Subjective cognitive decline (SCD) is recognized as a risk stage for Alzheimer’s disease (AD) and other dementias, but its prevalence is not well known. We aimed to use uniform criteria to better estimate SCD prevalence across international cohorts. Methods: We combined individual participant data for 16 cohorts from 15 countries (members of the COSMIC consortium) and used qualitative and quantitative (Item Response Theory/IRT) harmonization techniques to estimate SCD prevalence. Results: The sample comprised 39,387 cognitively unimpaired individuals above age 60. The prevalence of SCD across studies was around one quarter with both qualitative harmonization/QH (23.8%, 95%CI = 23.3–24.4%) and IRT (25.6%, 95%CI = 25.1–26.1%); however, prevalence estimates varied largely between studies (QH 6.1%, 95%CI = 5.1–7.0%, to 52.7%, 95%CI = 47.4–58.0%; IRT: 7.8%, 95%CI = 6.8–8.9%, to 52.7%, 95%CI = 47.4–58.0%). Across studies, SCD prevalence was higher in men than women, in lower levels of education, in Asian and Black African people compared to White people, in lower- and middle-income countries compared to high-income countries, and in studies conducted in later decades. Conclusions: SCD is frequent in old age. Having a quarter of older individuals with SCD warrants further investigation of its significance, as a risk stage for AD and other dementias, and of ways to help individuals with SCD who seek medical advice. Moreover, a standardized instrument to measure SCD is needed to overcome the measurement variability currently dominant in the field.
Informing Harmonization Decisions in Integrative Data Analysis: Exploring the Measurement Multiverse
Combining datasets in an integrative data analysis (IDA) requires researchers to make a number of decisions about how best to harmonize item responses across datasets. This entails two sets of steps: logical harmonization, which involves combining items which appear similar across datasets, and analytic harmonization, which involves using psychometric models to find and account for cross-study differences in measurement. Embedded in logical and analytic harmonization are many decisions, from deciding whether items can be combined prima facie to how best to find covariate effects on specific items. Researchers may not have specific hypotheses about these decisions, and each individual choice may seem arbitrary, but the cumulative effects of these decisions are unknown. In the current study, we conducted an IDA of the relationship between alcohol use and delinquency using three datasets (total N = 2245). For analytic harmonization, we used moderated nonlinear factor analysis (MNLFA) to generate factor scores for delinquency. We conducted both logical and analytic harmonization 72 times, each time making a different set of decisions. We assessed the cumulative influence of these decisions on MNLFA parameter estimates, factor scores, and estimates of the relationship between delinquency and alcohol use. There were differences across paths in MNLFA parameter estimates, but fewer differences in estimates of factor scores and regression parameters linking delinquency to alcohol use. These results suggest that factor scores may be relatively robust to subtly different decisions in data harmonization, and measurement model parameters are less so.
A comparison of methods to harmonize cortical thickness measurements across scanners and sites
Results of neuroimaging datasets aggregated from multiple sites may be biased by site-specific profiles in participants’ demographic and clinical characteristics, as well as MRI acquisition protocols and scanning platforms. We compared the impact of four different harmonization methods on results obtained from analyses of cortical thickness data: (1) linear mixed-effects model (LME) that models site-specific random intercepts (LMEINT), (2) LME that models both site-specific random intercepts and age-related random slopes (LMEINT+SLP), (3) ComBat, and (4) ComBat with a generalized additive model (ComBat-GAM). Our test case for comparing harmonization methods was cortical thickness data aggregated from 29 sites, which included 1,340 cases with posttraumatic stress disorder (PTSD) (6.2–81.8 years old) and 2,057 trauma-exposed controls without PTSD (6.3–85.2 years old). We found that, compared to the other data harmonization methods, data processed with ComBat-GAM was more sensitive to the detection of significant case-control differences (χ²(3) = 63.704, p < 0.001) as well as case-control differences in age-related cortical thinning (χ²(3) = 12.082, p = 0.007). Both ComBat and ComBat-GAM outperformed LME methods in detecting sex differences (χ²(3) = 9.114, p = 0.028) in regional cortical thickness. ComBat-GAM also led to stronger estimates of age-related declines in cortical thickness (corrected p-values < 0.001), stronger estimates of case-related cortical thickness reduction (corrected p-values < 0.001), weaker estimates of age-related declines in cortical thickness in cases than controls (corrected p-values < 0.001), stronger estimates of cortical thickness reduction in females than males (corrected p-values < 0.001), and stronger estimates of cortical thickness reduction in females relative to males in cases than controls (corrected p-values < 0.001). 
Our results support the use of ComBat-GAM to minimize confounds and increase statistical power when harmonizing data with non-linear effects, and the use of either ComBat or ComBat-GAM for harmonizing data with linear effects.
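To make the idea of removing site effects concrete, here is a deliberately simplified location-and-scale adjustment. It is not ComBat or ComBat-GAM: it omits the empirical Bayes shrinkage and the covariate modeling (age, sex, diagnosis) that those methods rely on, and the function name `location_scale_harmonize` and the toy thickness values are hypothetical. It only shows the basic step of re-centering and re-scaling each site to a common mean and spread.

```python
import math
from statistics import mean, pstdev

def location_scale_harmonize(values, sites):
    """Shift and scale each site's values to a common mean and spread.

    Simplified illustration only: no covariates, no empirical Bayes.
    """
    grand_mean = mean(values)

    # Group values by site and compute per-site mean and standard deviation.
    by_site = {}
    for v, s in zip(values, sites):
        by_site.setdefault(s, []).append(v)
    site_stats = {s: (mean(vs), pstdev(vs)) for s, vs in by_site.items()}

    # Pooled within-site standard deviation (site means removed first).
    squared_resid = sum((v - site_stats[s][0]) ** 2 for v, s in zip(values, sites))
    pooled_sd = math.sqrt(squared_resid / len(values))

    harmonized = []
    for v, s in zip(values, sites):
        site_mean, site_sd = site_stats[s]
        z = (v - site_mean) / site_sd if site_sd > 0 else 0.0
        harmonized.append(grand_mean + z * pooled_sd)
    return harmonized

# Hypothetical cortical thickness values (mm) from two scanners.
values = [2.0, 2.2, 2.4, 3.0, 3.4, 3.8]
sites = ["A", "A", "A", "B", "B", "B"]
harmonized = location_scale_harmonize(values, sites)
print([round(v, 3) for v in harmonized])
```

Because this sketch removes every between-site difference, it would also erase real biological differences whenever sites differ in case mix or age. That is the reason methods such as ComBat model covariates explicitly, so their effects are preserved during adjustment.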
Conceptual design of a generic data harmonization process for OMOP common data model
Background: To gain insight into the real-life care of patients in the healthcare system, data from hospital information systems and insurance systems are required. Consequently, linking clinical data with claims data is necessary. To ensure their syntactic and semantic interoperability, the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) from the Observational Health Data Sciences and Informatics (OHDSI) community was chosen. However, there is no detailed guide that would allow researchers to follow a generic process for data harmonization, i.e. the transformation of local source data into the standardized OMOP CDM format. Thus, the aim of this paper is to conceptualize a generic data harmonization process for OMOP CDM. Methods: For this purpose, we conducted a literature review focusing on publications that address the harmonization of clinical or claims data in OMOP CDM. Subsequently, the process steps used and their chronological order as well as applied OHDSI tools were extracted for each included publication. The results were then compared to derive a generic sequence of the process steps. Results: From 23 publications included, a generic data harmonization process for OMOP CDM was conceptualized, consisting of nine process steps: dataset specification, data profiling, vocabulary identification, coverage analysis of vocabularies, semantic mapping, structural mapping, extract-transform-load-process, qualitative and quantitative data quality analysis. Furthermore, we identified seven OHDSI tools which supported five of the process steps. Conclusions: The generic data harmonization process can be used as a step-by-step guide to assist other researchers in harmonizing source data in OMOP CDM.
Long time series (1984–2020) of albedo variations on the Greenland ice sheet from harmonized Landsat and Sentinel 2 imagery
Albedo is a key factor in modulating the absorption of solar radiation on ice surfaces. Satellite measurements have shown a general reduction in albedo across the Greenland ice sheet over the past few decades, particularly along the western margin of the ice sheet, a region known as the Dark Zone (albedo < 0.45). Here we chose a combination of Landsat 4–8 and Sentinel 2 imagery to enable us to derive the longest record of albedo variations in the Dark Zone, running from 1984 to 2020. We developed a simple, pragmatic and efficient sensor transformation to provide a long time series of consistent, harmonized satellite imagery. Narrow to broadband conversion algorithms were developed from regression models of harmonized satellite data and in situ albedo from the Program for Monitoring of the Greenland Ice Sheet (PROMICE) automatic weather stations. The albedo derived from the harmonized Landsat and Sentinel 2 data shows that the maximum extent of the Dark Zone expanded rapidly between 2005 and 2007, increasing to ~280% of the average annual maximum extent of 2900 km² to ~8000 km² since. The Dark Zone is continuing to darken slowly, with the average annual minimum albedo decreasing at a rate of $\sim \!-0.0006 \pm 0.0004 \, {\rm a}^{-1}$ (p = 0.16, 2001–2020).
Pioneering a multi-phase framework to harmonize self-reported sleep data across cohorts
Study Objectives: Harmonizing and aggregating data across studies enables pooled analyses that support external validation and enhance replicability and generalizability. However, the multidimensional nature of sleep poses challenges for data harmonization and aggregation. Here we describe and implement our process for harmonizing self-reported sleep data. Methods: We established a multi-phase framework to harmonize self-reported sleep data: (1) compile items, (2) group items into domains, (3) harmonize items, and (4) evaluate harmonizability. We applied this process to produce a pooled multi-cohort sample of five US cohorts plus a separate yet fully harmonized sample from Rotterdam, Netherlands. Sleep and sociodemographic data are described and compared to demonstrate the utility of harmonization and aggregation. Results: We collected 190 unique self-reported sleep items and grouped them into 15 conceptual domains. Using these domains as guiderails, we developed 14 harmonized items measuring aspects of satisfaction, alertness/sleepiness, timing, efficiency, duration, insomnia, and sleep apnea. External raters determined that 13 of these 14 items had moderate-to-high harmonizability. Alertness/Sleepiness items had lower harmonizability, while continuous, quantitative items (e.g. timing, total sleep time, and efficiency) had higher harmonizability. Descriptive statistics identified features that are more consistent (e.g. wake-up time and duration) and more heterogeneous (e.g. time in bed and bedtime) across samples. Conclusions: Our process can guide researchers and cohort stewards toward effective sleep harmonization and provide a foundation for further methodological development in this expanding field. Broader national and international initiatives promoting common data elements across cohorts are needed to enhance future harmonization and aggregation efforts.