Catalogue Search | MBRL
11 result(s) for "Multi‐Center Validation"
Machine Learning Predicts Risk of Falls in Parkinson's Disease Patients in a Multicenter Observational Study
by Garbarino, Sara; Piana, Michele; Castellano, Antonella
in Accidental Falls - prevention & control; Accidental Falls - statistics & numerical data; Aged
2025
Background: Postural instability and gait difficulties are key symptoms of Parkinson's disease (PD), elevating the risk of falls substantially. Falls afflict 35% to 90% of PD patients, representing a major challenge in managing the condition. Accurate prediction of fall risk and identification of contributing factors are essential for timely interventions. Objectives: Our objective was to develop and validate a machine learning (ML) algorithm across multiple centers in Italy to accurately forecast fall risk and identify related factors using routinely collected clinical data. Methods: Patient data from two Italian centers (N = 251) were divided into a training cohort (N = 164) for ML model development and a validation cohort (N = 87). External validation was conducted on a subset of PPMI study patients (N = 65). We compared the performance of logistic regression (LR) and Support Vector Classifier (SVC) models trained on clinical data. The Shapley Additive exPlanations (SHAP) method was employed to examine the predictive power of individual variables. Results: In the training set, SVC outperformed LR slightly (AUC: LR = 0.779 ± 0.054, SVC = 0.792 ± 0.056). However, LR demonstrated better prediction accuracy in both internal (AUC: LR = 0.753, SVC = 0.733) and external validation cohorts (AUC: LR = 0.714, SVC = 0.676). SHAP analysis on the LR model revealed associations between fall risk and both motor and non‐motor variables. Conclusions: ML‐based models effectively estimate fall risk across different clinical centers, enabling tailored interventions to enhance PD patients' quality of life. Challenges persist in predicting falls in US‐based patients due to demographic and healthcare system differences.
Journal Article
Development and validation of a multimodal automatic interictal epileptiform discharge detection model: a prospective multi-center study
2025
Background
Visual identification of interictal epileptiform discharge (IED) is expert-biased and time-consuming. Accurate automated IED detection models can facilitate epilepsy diagnosis. This study aims to develop a multimodal IED detection model (vEpiNetV2) and conduct a multi-center validation.
Methods
We constructed a large training dataset to train vEpiNetV2, which comprises 26,706 IEDs and 194,797 non-IED 4-s video-EEG epochs from 530 patients at Peking Union Medical College Hospital (PUMCH). The automated IED detection model was constructed using deep learning based on video and electroencephalogram (EEG) features. We proposed a bad channel removal model and patient detection method to improve the robustness of vEpiNetV2 for multi-center validation. Performance is verified in a prospective multi-center test dataset, with area under the precision-recall curve (AUPRC) and area under the curve (AUC) as metrics.
Results
To fairly evaluate the model performance, we constructed a large test dataset containing 149 patients, 377 h video-EEG data, and 9232 IEDs from PUMCH, Children’s Hospital Affiliated to Shandong University (SDQLCH) and Beijing Tiantan Hospital (BJTTH). Amplitude discrepancies are observed across centers and could be classified by a classifier. vEpiNetV2 demonstrated favorable accuracy for the IED detection, achieving AUPRC/AUC values of 0.76/0.98 (PUMCH), 0.78/0.96 (SDQLCH), and 0.76/0.98 (BJTTH), with false positive rates of 0.16–0.31 per minute at 80% sensitivity. Incorporating video features improves precision by 9%, 7%, and 5% at three centers, respectively. At 95% sensitivity, video features eliminated 24% false positives in the whole test dataset. While bad channels decreased model precision, video features compensate for this deficiency. Accurate patient detection is essential; otherwise, incorrect patient detection can negatively impact overall performance.
Conclusions
The multimodal IED detection model, which integrates video and EEG features, demonstrated high precision and robustness. The large multi-center validation confirmed its potential for real-world clinical application and the value of video features in IED analysis.
Journal Article
Electronic-Medical-Record-Driven Machine Learning Predictive Model for Hospital-Acquired Pressure Injuries: Development and External Validation
by Kia, Arash; Reich, David L.; Sevillano, Maria
in Accuracy; Artificial intelligence; Automation
2025
Background: Hospital-acquired pressure injuries (HAPIs) affect approximately 2.5 million patients annually in the United States, leading to increased morbidity and healthcare costs. Current rule-based screening tools, such as the Braden Scale, lack sensitivity, highlighting the need for improved risk prediction methods. Methods: We developed and externally validated a machine learning model to predict HAPI risk using longitudinal electronic medical record (EMR) data. This study included adult inpatients (2018–2023) across five hospitals within a large health system. An automated pipeline was built for EMR data curation, labeling, and integration. The model employed XGBoost with recursive feature elimination to identify 35 optimal clinical variables and utilized time-series analysis for dynamic risk prediction. Results: Internal validation and multi-center external validation on 5510 hospitalizations demonstrated AUROC values of 0.83–0.85. The model outperformed the Braden Scale in sensitivity and F1-score and showed superior performance compared to previous predictive models. Conclusions: This is the first externally validated, cross-institutional HAPI prediction model using longitudinal EMR data and automated pipelines. The model demonstrates strong generalizability, scalability, and real-time applicability, offering a novel bioengineering approach to improve HAPI prevention, patient care, and clinical operations.
Journal Article
Desargues cloud TPS: a cloud-based automatic radiation treatment planning system for IMRT
2026
Purpose
To develop a cloud-based automated treatment planning system for intensity-modulated radiation therapy and evaluate its efficacy and safety for tumors in various anatomical sites under general clinical scenarios.
Results
All the plans from both groups satisfy the PTV prescription dose coverage requirement of at least 95% of the PTV volume. The mean HI of plan A group and plan B group is 0.084 and 0.081, respectively, with no statistically significant difference from those of plan C group. The mean CI, PQM, OOT and POT are 0.806, 77.55, 410 s and 185 s for plan A group, and 0.841, 76.87, 515.1 s and 271.1 s for plan B group, which were significantly superior to those of plan C group except for the CI of plan A group. There is no statistically significant difference between the dose accuracies of plan B and plan C groups.
Conclusions
It is concluded that the overall efficacy and safety of the Desargues Cloud TPS are not significantly different from those of Varian Eclipse, while some efficacy indicators of plans generated from automatic planning without or with manual adjustments are even significantly superior to those of fully manual plans from Eclipse. The cloud-based automatic treatment planning additionally increases the efficiency of the treatment planning process and facilitates the sharing of planning knowledge.
Materials and methods
The cloud-based automatic radiation treatment planning system, Desargues Cloud TPS, was designed and developed based on browser/server mode, where all the computing-intensive functions were deployed on the server and user interfaces were implemented on the web. The communication between the browser and the server was through the local area network (LAN) of a radiotherapy institution. The automatic treatment planning module adopted a hybrid of both knowledge-based planning (KBP) and protocol-based automatic iterative optimization (PB–AIO), consisting of three steps: beam angle optimization (BAO), beam fluence optimization (BFO) and machine parameter optimization (MPO). Fifty-three patients from two institutions were enrolled in a multi-center self-controlled clinical validation. For each patient, three IMRT plans were designed. Plans A and B were designed on Desargues Cloud TPS using automatic planning without and with manual adjustments, respectively. Plan C was designed on Varian Eclipse TPS using fully manual planning. The efficacy indicators were heterogeneous index, conformity index, plan quality metric, overall operation time and plan optimization time. The safety indicators were gamma indices of dose verification.
Journal Article
Multi-Center Evaluation of Gel-Based and Dry Multipin EEG Caps
by Vasconcelos, Beatriz; Supriyanto, Eko; Fonseca, Carlos
in Analysis; Biofeedback training; dry electrodes
2022
Dry electrodes for electroencephalography (EEG) allow new fields of application, including telemedicine, mobile EEG, emergency EEG, and long-term repetitive measurements for research, neurofeedback, or brain–computer interfaces. Different dry electrode technologies have been proposed and validated in comparison to conventional gel-based electrodes. Most previous studies have been performed at a single center and by single operators. We conducted a multi-center and multi-operator study validating multipin dry electrodes to study the reproducibility and generalizability of their performance in different environments and for different operators. Moreover, we aimed to study the interrelation of operator experience, preparation time, and wearing comfort on the EEG signal quality. EEG acquisitions using dry and gel-based EEG caps were carried out in 6 different countries with 115 volunteers, recording electrode-skin impedances, resting state EEG and evoked activity. The dry cap showed average channel reliability of 81% but higher average impedances than the gel-based cap. However, the dry EEG caps required 62% less preparation time. No statistical differences were observed between the gel-based and dry EEG signal characteristics in all signal metrics. We conclude that the performance of the dry multipin electrodes is highly reproducible, whereas the primary influences on channel reliability and signal quality are operator skill and experience.
Journal Article
Development and multi-center validation of machine learning models based on targeted metabolomics for rheumatoid arthritis
by Gao, Huali; Sheng, Huiming; Jiang, Renquan
in Biomarkers; Biomedical and Life Sciences; Biomedicine
2025
Background
Rheumatoid arthritis (RA) remains in urgent need of more effective biomarkers to improve diagnostic accuracy.
Methods
In this study, we conducted a comprehensive analysis of 2,863 blood samples obtained from seven cohorts comprising RA, osteoarthritis (OA), and healthy control (HC) subjects, recruited across five medical centers spanning three geographically diverse regions. Candidate biomarkers were first identified through untargeted metabolomic profiling, and subsequently validated using targeted approaches. Metabolite-based classification models were then developed employing a range of machine learning algorithms.
Results
Six metabolites were ultimately identified as promising diagnostic biomarkers, including imidazoleacetic acid, ergothioneine, N-acetyl-L-methionine, 2-keto-3-deoxy-D-gluconic acid, 1-methylnicotinamide and dehydroepiandrosterone sulfate. Based on these metabolites, we constructed classification models to differentiate RA from both HC and OA groups, and evaluated their performance across multiple independent validation cohorts. In three geographically distinct cohorts, RA vs. HC classifiers demonstrated robust discriminatory power, with an area under the receiver operating characteristic curve (AUC) ranging from 0.8375 to 0.9280, while RA vs. OA classifiers achieved moderate to good accuracy (AUC range: 0.7340–0.8181). Importantly, analysis of the seronegative RA subgroup indicated that the classifier’s performance was independent of serological status. Furthermore, validations conducted across different sample types and analytical platforms confirmed the reproducibility and stability of the models.
Conclusions
Taken together, these findings highlight the utility of metabolomics as a complementary approach for improving RA diagnosis and establish a broadly applicable framework for the development of metabolite-based classifiers across diverse and clinically heterogeneous disease contexts.
Journal Article
Development and external validation of a machine learning-based prognostic model for small cell neuroendocrine cervical carcinoma: a multi-center study
2025
Background
Small cell neuroendocrine cervical carcinoma (SCNECC) is a rare malignancy with a poor prognosis. The prognostic factors influencing SCNECC remain unclear. This study aimed to develop a prognostic model for SCNECC using machine learning (ML) techniques.
Methods
We collected 487 patients diagnosed with SCNECC in the SEER database from 2004 to 2021, dividing them into a training set and an internal validation set at a 7:3 ratio. Additionally, we gathered 300 SCNECC patients from 3 Chinese registries between 2005 and 2023 as an external validation set. Initially, we performed univariate Cox regression analyses on 22 candidate variables using the Mime package. Variables with a p-value < 0.05 were included. Subsequently, to determine the optimal prognostic model, a total of 10 commonly used ML algorithms were collected and subsequently combined into 117 unique combinations. Finally, we validated the best model's performance using multiple independent cohorts, assessing metrics such as the concordance index (C-index), calibration curves, time-dependent receiver operating characteristic curves (ROC curves), and decision curve analyses (DCA).
Results
The Stepwise Cox (StepCox) [forward] + Random Survival Forest (RSF) (SCR) model demonstrated the best predictive performance, with a C-index of 0.84 in the development set, 0.75 in the internal validation set, and 0.68 in the external validation set. It showed high prognostic value for 1-, 3-, and 5-year survival in SCNECC patients. SHAP-based interpretability analysis identified twenty key predictors that collectively enhanced the model's robustness.
Conclusion
The SCR model has potential in predicting the prognosis of SCNECC, providing clinicians with decision support to identify high-risk patients, optimize treatment strategies, and ultimately improve clinical outcomes.
Journal Article
A new automatic algorithm for quantification of myocardial infarction imaged by late gadolinium enhancement cardiovascular magnetic resonance: experimental validation and comparison to expert delineations in multi-center, multi-vendor patient data
2016
Late gadolinium enhancement (LGE) cardiovascular magnetic resonance (CMR) using magnitude inversion recovery (IR) or phase sensitive inversion recovery (PSIR) has become the clinical standard for assessment of myocardial infarction (MI). However, there is no clinical standard for quantification of MI even though multiple methods have been proposed. Simple thresholds have yielded varying results and advanced algorithms have only been validated in single center studies. Therefore, the aim of this study was to develop an automatic algorithm for MI quantification in IR and PSIR LGE images and to validate the new algorithm experimentally and compare it to expert delineations in multi-center, multi-vendor patient data.
The new automatic algorithm, EWA (Expectation Maximization, weighted intensity, a priori information), was implemented using an intensity threshold by Expectation Maximization (EM) and a weighted summation to account for partial volume effects.
The EWA algorithm was validated in-vivo against triphenyltetrazolium-chloride (TTC) staining (n = 7 pigs with paired IR and PSIR images) and against ex-vivo high resolution T1-weighted images (n = 23 IR and n = 13 PSIR images). The EWA algorithm was also compared to expert delineation in 124 patients from multi-center, multi-vendor clinical trials 2–6 days following first time ST-elevation myocardial infarction (STEMI) treated with percutaneous coronary intervention (PCI) (n = 124 IR and n = 49 PSIR images).
Infarct size by the EWA algorithm in vivo in pigs showed a bias to ex-vivo TTC of −1 ± 4%LVM (R = 0.84) in IR and −2 ± 3%LVM (R = 0.92) in PSIR images and a bias to ex-vivo T1-weighted images of 0 ± 4%LVM (R = 0.94) in IR and 0 ± 5%LVM (R = 0.79) in PSIR images. In multi-center patient studies, infarct size by the EWA algorithm showed a bias to expert delineation of −2 ± 6%LVM (R = 0.81) in IR images (n = 124) and 0 ± 5%LVM (R = 0.89) in PSIR images (n = 49).
The EWA algorithm was validated experimentally and in patient data with a low bias in both IR and PSIR LGE images. Thus, the use of EM and a weighted intensity, as in the EWA algorithm, may serve as a clinical standard for the quantification of myocardial infarction in LGE CMR images.
CHILL-MI: NCT01379261. MITOCARE: NCT01374321.
Journal Article
Domain Shift in Breast DCE-MRI Tumor Segmentation: A Balanced LoCoCV Study on the MAMA-MIA Dataset
2026
Background and Objectives: Accurate breast tumor segmentation in dynamic contrast-enhanced MRI (DCE-MRI) is crucial for treatment planning, therapy monitoring, and quantitative studies of breast cancer response. However, deep learning models often have worse performance when applied to new hospitals because scanner hardware, acquisition protocols, and patient populations differ from those in the training data. This study investigates how such center-related domain shift affects automated breast DCE-MRI tumor segmentation on the multi-center MAMA-MIA dataset. Methods: We trained a standard 3D U-Net for primary tumor segmentation under two evaluation settings. First, we constructed a random patient-wise split that mixes cases from the three main MAMA-MIA center groups (ISPY2, DUKE, NACT) and used this as an in-distribution reference. Second, we designed a balanced leave-one-center-out cross-validation (LoCoCV) protocol in which each center is held out in turn, while training, validation, and test sets are matched in size across folds. Performance was assessed using the Dice similarity coefficient, 95th percentile Hausdorff distance (HD95), sensitivity, specificity, and related overlap measures. Results: On the mixed-center random split, the best three-channel model achieved a mean Dice of about 0.68 and a mean HD95 of about 19.7 mm on the held-out test set, indicating good volumetric overlap and boundary accuracy when training and test distributions match. Under balanced LoCoCV, the one-channel model reached a mean Dice of about 0.45 and a mean HD95 of about 41 mm on unseen centers, with similar averages for the three-channel variant. Compared with the random split baseline, Dice and sensitivity decreased, while HD95 nearly doubled, showing that boundary errors become larger and segmentations less reliable when the model is applied to new centers. 
Conclusions: A model that performs well on mixed-center random splits can still suffer a substantial loss of accuracy on completely unseen institutions. The balanced LoCoCV design makes this out-of-distribution penalty visible by separating center-related effects from sample size effects. These findings highlight the need for robust multi-center training strategies and explicit cross-center validation before deploying breast DCE-MRI segmentation models in clinical practice.
Journal Article