MBRL Search Results

3,647 result(s) for "prediction validation"
Real-Time Auto-Monitoring of Livestock: Quantitative Framework and Challenges
The use of automated sensors has grown rapidly in recent years, with sensor data now routinely used for monitoring in a wide range of situations, including human health and behaviour, the environment, wildlife, and agriculture. Livestock farming is a key area of application, and our primary focus here, but the issues discussed are widely applicable. There is the potential to massively increase the use of empirical data for decision-making in real time, and a range of quantitative methods, including machine learning and statistical methods, have been proposed for this purpose within the literature. In many areas, however, development and validation of quantitative approaches are still needed in order for these methods to effectively inform decision-making. Within the context of livestock farming, for example, it must be practically feasible to repeatedly apply the method dynamically in real time on farms in order to optimise decision-making, and we discuss the challenges in using quantitative approaches for this purpose. It is also crucial to evaluate and compare the applied performance of methods in a fair and robust way—such comparisons are currently lacking within the literature on livestock farming, and we outline approaches to addressing this key gap.
What drove giant panda Ailuropoda melanoleuca expansion in the Qinling Mountains? An analysis comparing the influence of climate, bamboo, and various landscape variables in the past decade
The role of climatic and aclimatic factors in species distribution has been debated widely among ecologists and conservationists. It is often difficult to attribute empirically observed changes in species distribution to climatic or aclimatic factors. Giant pandas (A. melanoleuca) offer a rare opportunity to study the impact of climatic and aclimatic factors, particularly food sources, on recent distribution changes, as well-documented information exists on both giant pandas and bamboos. Here, we ask how climate metrics compare to a bamboo suitability metric in predicting giant panda occurrences outside the central areas of the Qinling Mountains during the past decade. We also seek to understand the relative importance of different landscape-level variables in predicting giant panda emigration beyond areas of high giant panda density. We utilize data from the 3rd and 4th National Giant Panda Surveys (NGPSs) for our analysis. We evaluate the performance of species distribution models trained on climate, bamboo suitability, and the combination of the two. We then identify, at four spatial scales, the optimal models for predicting giant panda emigration between the 3rd and 4th NGPSs from a list of landscape-level environmental variables. Our results show that models using bamboo suitability alone consistently outperform the bioclimatic and combined models; distance to the high giant panda density core area and bamboo suitability show high importance in predicting expansion probability across all four scales. Our results also suggest that a bamboo distribution extrapolated from bamboo occurrence data can provide a practical and more reliable alternative for predicting potential expansion and emigration of giant pandas along the range edge. This suggests that restoring bamboo forests in the vicinity of high giant panda density areas is likely a more reliable strategy for supporting shifting giant panda populations.
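To make the model comparison concrete, here is a minimal, hypothetical sketch of evaluating species distribution models built from different predictor sets (climate-only, bamboo-suitability-only, and combined) by cross-validated AUC. The study's actual modelling framework is not specified here; the column names, bioclimatic variables, and the logistic-regression choice are illustrative assumptions.

```python
# Sketch: compare predictor sets for a presence/absence SDM via CV AUC.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

PREDICTOR_SETS = {
    "climate": ["bio1", "bio12"],                  # assumed bioclim variables
    "bamboo": ["bamboo_suitability"],
    "combined": ["bio1", "bio12", "bamboo_suitability"],
}

def compare_sdms(df: pd.DataFrame, label: str = "panda_presence") -> dict:
    """Return mean 5-fold cross-validated AUC for each predictor set."""
    scores = {}
    for name, cols in PREDICTOR_SETS.items():
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        scores[name] = cross_val_score(model, df[cols], df[label],
                                       cv=5, scoring="roc_auc").mean()
    return scores
```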
External validation of a 2-year all-cause mortality prediction tool developed using machine learning in patients with stage 4-5 chronic kidney disease
BACKGROUND: Chronic kidney disease (CKD) is associated with increased mortality. Individual mortality prediction could be of interest to improve individual clinical outcomes. Using an independent regional dataset, the aim of the present study was to externally validate the recently published 2-year all-cause mortality prediction tool developed using machine learning. METHODS: A validation dataset of stage 4 or 5 CKD outpatients was used. External validation performance of the prediction tool at the optimal cutoff-point was assessed by the area under the receiver operating characteristic curve (AUC-ROC), accuracy, sensitivity, and specificity. A survival analysis was then performed using the Kaplan-Meier method. RESULTS: Data of 527 outpatients with stage 4 or 5 CKD were analyzed. During the 2 years of follow-up, 91 patients died and 436 survived. Compared to the learning dataset, patients in the validation dataset were significantly younger, and the ratio of deceased patients in the validation dataset was significantly lower. The performance of the prediction tool at the optimal cutoff-point was: AUC-ROC = 0.72, accuracy = 63.6%, sensitivity = 72.5%, and specificity = 61.7%. The survival curves of the predicted survived and the predicted deceased groups were significantly different (p < 0.001). CONCLUSION: The 2-year all-cause mortality prediction tool for patients with stage 4 or 5 CKD showed satisfactory discriminatory capacity with emphasis on sensitivity. The proposed prediction tool appears to be of clinical interest for further development.
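As a rough illustration of the external-validation steps described in this abstract (discrimination at a fixed cutoff plus a Kaplan-Meier comparison of the predicted groups), the following is a minimal sketch, not the authors' code; the function names, input arrays, and cutoff are assumptions.

```python
# Sketch: cutoff-based validation metrics and a Kaplan-Meier comparison.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

def validate_at_cutoff(y_true, risk_score, cutoff):
    """AUC-ROC, accuracy, sensitivity, and specificity at a chosen cutoff."""
    y_pred = (risk_score >= cutoff).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "auc_roc": roc_auc_score(y_true, risk_score),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def compare_survival(time, event, y_pred):
    """Kaplan-Meier fits for the predicted groups plus a log-rank p-value."""
    groups = {"predicted survived": y_pred == 0, "predicted deceased": y_pred == 1}
    fits = {label: KaplanMeierFitter().fit(time[m], event[m], label=label)
            for label, m in groups.items()}
    p_value = logrank_test(time[y_pred == 0], time[y_pred == 1],
                           event_observed_A=event[y_pred == 0],
                           event_observed_B=event[y_pred == 1]).p_value
    return fits, p_value
```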
Comparison of the diagnostic performance of twelve noninvasive scores of metabolic dysfunction-associated fatty liver disease
Background The absence of distinct symptoms in the majority of individuals with metabolic dysfunction-associated fatty liver disease (MAFLD) poses challenges in identifying those at high risk, so simple, efficient, and cost-effective noninvasive scores are needed to aid healthcare professionals in patient identification. Most noninvasive scores were developed for the diagnosis of nonalcoholic fatty liver disease (NAFLD); consequently, the objective of this study was to systematically assess the diagnostic ability of 12 noninvasive scores (METS-IR/TyG/TyG-WC/TyG-BMI/TyG-WtHR/VAI/HSI/FLI/ZJU/FSI/K-NAFLD) for MAFLD. Methods The study recruited eligible participants from two sources: the National Health and Nutrition Examination Survey (NHANES) 2017-2020.3 cycle and the database of the West China Hospital Health Management Center. Performance was assessed using various metrics, including area under the receiver operating characteristic curve (AUC), net reclassification index (NRI), integrated discrimination improvement (IDI), decision curve analysis (DCA), and subgroup analysis. Results A total of 7398 participants from the NHANES cohort and 4880 patients from the Western China cohort were included. TyG-WC had the best predictive power for MAFLD risk in the NHANES cohort (AUC 0.863, 95% CI 0.855–0.871), while TyG-BMI had the best predictive ability in the Western China cohort (AUC 0.903, 95% CI 0.895–0.911), outperforming the other models; considering IDI, NRI, DCA, and subgroup analysis together, TyG-WC remained superior in the NHANES cohort and TyG-BMI in the Western China cohort. Conclusions TyG-BMI demonstrated satisfactory diagnostic efficacy in identifying individuals at a heightened risk of MAFLD in Western China. Conversely, TyG-WC exhibited the best diagnostic performance for MAFLD risk recognition in the United States population. These findings suggest the necessity of selecting the most suitable predictive models based on regional and ethnic variations.
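For readers unfamiliar with the TyG-derived scores compared here, the sketch below computes two of them and their AUCs. It assumes the standard TyG formula, ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), with TyG-BMI and TyG-WC taken as TyG multiplied by BMI and waist circumference; the data-frame column names are illustrative assumptions, not the study's variable names.

```python
# Sketch: compute TyG-based scores and compare their discrimination for MAFLD.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def add_tyg_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Append TyG, TyG-BMI, and TyG-WC columns to a cohort table."""
    df = df.copy()
    df["TyG"] = np.log(df["triglycerides_mgdl"] * df["glucose_mgdl"] / 2)
    df["TyG_BMI"] = df["TyG"] * df["bmi"]
    df["TyG_WC"] = df["TyG"] * df["waist_cm"]
    return df

def compare_aucs(df: pd.DataFrame, outcome: str = "mafld") -> dict:
    """AUC of each score against the MAFLD label."""
    return {score: roc_auc_score(df[outcome], df[score])
            for score in ("TyG", "TyG_BMI", "TyG_WC")}
```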
Evaluation of machine learning and logistic regression-based gestational diabetes prognostic models
This study aimed to follow best practice by temporally evaluating existing gestational diabetes mellitus (GDM) prediction models, updating them where needed, and comparing the temporal evaluation performance of the machine learning (ML)-based models with that of regression-based models. We utilized new data for the temporal validation dataset, comprising 12,722 singleton pregnancies at the Monash Health Network from 2021 to 2022. The Monash GDM Logistic Regression (LR) model with six categorical variables (version 2) and the Monash GDM ML model (version 3), along with an extended LR GDM model (version 3), each with eight categorical and continuous variables, were evaluated. Model performance was assessed using discrimination and calibration. Decision curve analyses (DCA) were performed to determine the net benefit of the models. Recalibration was considered to improve model performance. The development datasets for model versions 2 and 3 and the new temporal validation dataset included 21.2%, 22.5%, and 33.5% of pregnant women aged ≥35 years, respectively; 22%, 23.7%, and 24.0% with a body mass index ≥30 kg/m2; and GDM prevalence rates of 18%, 21.3%, and 28.6%, respectively. Discrimination performance was similar across the models, with areas under the receiver operating characteristic curve (AUC) of 0.72 [95% CI: 0.71, 0.73], 0.73 [95% CI: 0.72, 0.74], and 0.73 [95% CI: 0.73, 0.74] for the version 2, version 3 ML, and version 3 LR models, respectively. All models exhibited overestimation, with calibration slopes of 0.87, 0.99, and 0.87, respectively, which improved with recalibration. DCA showed that all models had better net benefit compared to treat-all and treat-none strategies. For all models, some variability was observed in prediction performance across ethnic groups and parity. Despite significant changes in the background characteristics of the population, we have demonstrated that all models remained robust, especially after recalibration. However, the performance of the original ML model decreased significantly during validation. Dynamic models are better suited to adapt to temporal changes in the baseline characteristics of pregnant women and the resulting calibration drift, as they can incorporate new data without requiring manual evaluation.
• Risk prediction models to identify women early in pregnancy with GDM are needed.
• Rigorous evaluations of GDM models are essential to advance the field.
• Extends prior temporal evaluations to include new machine learning models.
• Uses recalibration to address patient heterogeneity and temporal changes.
• Fills gap in advancing prediction tools for better care of pregnant women.
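The calibration slope and recalibration mentioned above can be illustrated with a short sketch: fit a logistic model of the observed outcome on the logit of the predicted risks to obtain a calibration intercept and slope, then apply them to recalibrate the predictions. This is a generic logistic-recalibration sketch under assumed variable names, not the Monash models' implementation.

```python
# Sketch: calibration intercept/slope and logistic recalibration.
import numpy as np
import statsmodels.api as sm

def calibration_slope_intercept(y_true, p_pred):
    """Fit y ~ logit(p) to estimate calibration intercept and slope."""
    eps = 1e-12
    lp = np.log(np.clip(p_pred, eps, 1 - eps) / np.clip(1 - p_pred, eps, 1))
    X = sm.add_constant(lp)
    fit = sm.GLM(y_true, X, family=sm.families.Binomial()).fit()
    intercept, slope = fit.params
    return intercept, slope

def recalibrate(p_pred, intercept, slope):
    """Apply the fitted intercept and slope to the original predicted risks."""
    eps = 1e-12
    lp = np.log(np.clip(p_pred, eps, 1 - eps) / np.clip(1 - p_pred, eps, 1))
    return 1 / (1 + np.exp(-(intercept + slope * lp)))
```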
Systematic validation of experimental data usable for verifying the multiaxial fatigue prediction methods
The paper discusses some of the issues faced by researchers interested in verifying various multiaxial fatigue limit estimation solutions. Even recently, newly proposed criteria have been, or are being, tested on dozens of experimental inputs. Papuga pointed out in [1] that the applicability of the most often used test batch is limited and that only half of these data items are worth using for such purposes. This paper extends that analysis by describing the weak points of various data sets used in this domain for validating new proposals on multiaxial fatigue limit estimates. The conclusion from the extensive analysis is that researchers should adopt other test sets only if they know their background very well.
Predictions of Vortex Flow in a Diesel Multi-Hole Injector Using the RANS Modelling Approach
The occurrence of vortices in the sac volume of automotive multi-hole fuel injectors plays an important role in the development of vortex cavitation, which directly influences the flow structure and emerging sprays that, in turn, influence engine performance and emissions. In this study, the RANS-based turbulence modelling approach was used to predict the internal flow in a vertical, axisymmetric six-hole diesel fuel injector under non-cavitating conditions. The project aimed to predict the aforementioned vortical structures accurately at two different needle lifts in order to characterise their occurrence correctly. The accuracy of the simulations was assessed by comparing the predicted mean axial velocity and RMS velocity with LDV measurements, which showed good agreement. The flow field analysis predicted a complex, three-dimensional vortical flow structure with different types of vortices present in the sac volume and the nozzle hole. Two main types of vortex were detected: the “hole-to-hole” connecting vortex, and double “counter-rotating” vortices emerging from the needle wall and entering the injector hole facing it. Different rotational directions of the “hole-to-hole” vortices were observed at the low needle lift (anticlockwise) and full needle lift (clockwise), due to their different flow passages in the sac; the much narrower passage at the lower lift causes a much higher momentum inflow.
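As a simple illustration of comparing CFD predictions against LDV data of the kind described above, the sketch below interpolates a predicted mean axial velocity profile onto LDV measurement locations and reports basic error metrics; the variable names, units, and choice of metrics are assumptions rather than the study's procedure.

```python
# Sketch: compare a CFD velocity profile with LDV measurements.
import numpy as np

def profile_error(r_cfd, u_cfd, r_ldv, u_ldv):
    """Interpolate the CFD profile onto LDV radial positions and compare.

    r_cfd must be sorted in ascending order for np.interp.
    """
    u_cfd_at_ldv = np.interp(r_ldv, r_cfd, u_cfd)
    err = u_cfd_at_ldv - u_ldv
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mean_abs_error": float(np.mean(np.abs(err))),
        "max_abs_error": float(np.max(np.abs(err))),
    }
```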
Validation and comparison of triage-based screening strategies for sepsis
This study sought to externally validate and compare proposed methods for stratifying sepsis risk at emergency department (ED) triage. This nested case/control study enrolled ED patients from four hospitals in Utah and evaluated the performance of previously published sepsis risk scores amenable to use at ED triage based on their area under the precision-recall curve (AUPRC, which balances positive predictive value and sensitivity) and area under the receiver operating characteristic curve (AUROC, which balances sensitivity and specificity). Score performance for predicting whether patients met Sepsis-3 criteria in the ED was compared to patients' assigned ED triage score (Canadian Triage and Acuity Scale [CTAS]) with adjustment for multiple comparisons. Among 2000 case/control patients, 981 met Sepsis-3 criteria on final adjudication. The best performing sepsis risk scores were the Predict Sepsis version #3 (AUPRC 0.183, 95 % CI 0.148–0.256; AUROC 0.859, 95 % CI 0.843–0.875) and Borelli scores (AUPRC 0.127, 95 % CI 0.107–0.160, AUROC 0.845, 95 % CI 0.829–0.862), which significantly outperformed CTAS (AUPRC 0.038, 95 % CI 0.035–0.042, AUROC 0.650, 95 % CI 0.628–0.671, p < 0.001 for all AUPRC and AUROC comparisons). The Predict Sepsis and Borelli scores exhibited sensitivity of 0.670 and 0.678 and specificity of 0.902 and 0.834, respectively, at their recommended cutoff values and outperformed Systemic Inflammatory Response Syndrome (SIRS) criteria (AUPRC 0.083, 95 % CI 0.070–0.102, p = 0.052 and p = 0.078, respectively; AUROC 0.775, 95 % CI 0.756–0.795, p < 0.001 for both scores). The Predict Sepsis and Borelli scores exhibited improved performance, including increased specificity and positive predictive values, for sepsis identification at ED triage compared to CTAS and SIRS criteria.
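Here is a minimal sketch of the two headline metrics used in this validation, AUPRC (estimated here with average precision) and AUROC, with bootstrap confidence intervals; the score and label arrays, resample count, and CI method are assumptions, not the study's statistical code.

```python
# Sketch: AUPRC and AUROC with percentile bootstrap confidence intervals.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

def auprc_auroc(y_true, score, n_boot=2000, seed=0):
    """Point estimates and 95% bootstrap CIs for AUPRC and AUROC."""
    rng = np.random.default_rng(seed)
    y_true, score = np.asarray(y_true), np.asarray(score)
    estimates = {"auprc": average_precision_score(y_true, score),
                 "auroc": roc_auc_score(y_true, score)}
    boot = {"auprc": [], "auroc": []}
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if y_true[idx].min() == y_true[idx].max():  # resample needs both classes
            continue
        boot["auprc"].append(average_precision_score(y_true[idx], score[idx]))
        boot["auroc"].append(roc_auc_score(y_true[idx], score[idx]))
    cis = {k: np.percentile(v, [2.5, 97.5]) for k, v in boot.items()}
    return estimates, cis
```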
Weighted metrics are required when evaluating the performance of prediction models in nested case–control studies
Background Nested case–control (NCC) designs are efficient for developing and validating prediction models that use expensive or difficult-to-obtain predictors, especially when the outcome is rare. Previous research has focused on how to develop prediction models in this sampling design, but little attention has been given to model validation in this context. We therefore aimed to systematically characterize the key elements for the correct evaluation of the performance of prediction models in NCC data. Methods We proposed how to correctly evaluate prediction models in NCC data, by adjusting performance metrics with sampling weights to account for the NCC sampling. We included in this study the C-index, threshold-based metrics, Observed-to-expected events ratio (O/E ratio), calibration slope, and decision curve analysis. We illustrated the proposed metrics with a validation of the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA version 5) in data from the population-based Rotterdam study. We compared the metrics obtained in the full cohort with those obtained in NCC datasets sampled from the Rotterdam study, with and without a matched design. Results Performance metrics without weight adjustment were biased: the unweighted C-index in NCC datasets was 0.61 (0.58–0.63) for the unmatched design, while the C-index in the full cohort and the weighted C-index in the NCC datasets were similar: 0.65 (0.62–0.69) and 0.65 (0.61–0.69), respectively. The unweighted O/E ratio was 18.38 (17.67–19.06) in the NCC datasets, while it was 1.69 (1.42–1.93) in the full cohort and its weighted version in the NCC datasets was 1.68 (1.53–1.84). Similarly, weighted adjustments of threshold-based metrics and net benefit for decision curves were unbiased estimates of the corresponding metrics in the full cohort, while the corresponding unweighted metrics were biased. In the matched design, the bias of the unweighted metrics was larger, but it could also be compensated by the weight adjustment. Conclusions Nested case–control studies are an efficient solution for evaluating the performance of prediction models that use expensive or difficult-to-obtain biomarkers, especially when the outcome is rare, but the performance metrics need to be adjusted to the sampling procedure.
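The weight adjustment described in this abstract can be sketched generically: each subject sampled into the nested case-control set carries an inverse probability-of-inclusion weight, and the performance metrics are computed with those weights. The example below shows a weighted O/E ratio and a simple weighted concordance index for a binary outcome; the inputs and the brute-force pairwise C-index are illustrative assumptions, not the paper's implementation.

```python
# Sketch: sampling-weight-adjusted performance metrics for NCC data.
import numpy as np

def weighted_oe_ratio(y, p, w):
    """Observed-to-expected events ratio with inverse-sampling weights."""
    return np.sum(w * y) / np.sum(w * p)

def weighted_c_index(y, p, w):
    """Weighted probability that a random case outranks a random control."""
    cases, controls = np.where(y == 1)[0], np.where(y == 0)[0]
    num = den = 0.0
    for i in cases:
        for j in controls:
            pair_w = w[i] * w[j]
            den += pair_w
            if p[i] > p[j]:
                num += pair_w
            elif p[i] == p[j]:
                num += 0.5 * pair_w
    return num / den
```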