Catalogue Search | MBRL

Beyond prediction intervals in meta-analysis: reporting the expected proportion of comparable studies with clinically relevant benefit or harm

by Evrenoglou, T. , Schwarzer, G. , Meerpohl, J.J. in Analysis , Cesarean Section - statistics & numerical data , Data Interpretation, Statistical

2025

Background In a meta-analysis where the effect size varies substantially between studies it is important to report the extent of the variation. Critically, we want to know if the treatment is always helpful or sometimes harmful. The statistic that addresses this is the prediction interval (PI), which gives the range of true effects for all studies comparable to those in the meta-analysis. Methods In addition to the PI’s upper and lower limits, we propose to report the expected proportion of comparable studies that are expected to have an effect in a given range. If we define for example thresholds corresponding to minimal clinically important benefit and harm, we can report the expected proportion of comparable studies where the true effect is expected to exceed these thresholds. Results We apply our approach to two Cochrane Reviews assessing a dichotomous and a continuous outcome: caesarean section and health-related quality of life. This article shows how to plot the distribution of true study effects highlighting the expected proportion of comparable studies where the true effect is clinically beneficial or harmful. We also offer suggestions for how to report this information in scientific articles. Conclusion In addition to PIs, reporting the expected proportion of comparable studies with relevant benefit or harm as supplementary information could help physicians and other decision-makers to understand the potential utility of an intervention. However, these metrics must be interpreted with caution because the estimate of the between‑study heterogeneity may be imprecise when data are limited.
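The reported proportion can be illustrated with a minimal sketch. It assumes the true study effects are normally distributed with mean `mu` and standard deviation `tau`, and it treats both as known, ignoring estimation uncertainty; the article's own method may differ. The function name, thresholds, and numbers are hypothetical.

```python
from statistics import NormalDist


def proportion_beyond(mu, tau, benefit_threshold, harm_threshold):
    """Expected proportions of comparable studies whose true effect lies
    below the benefit threshold or above the harm threshold.

    Assumes lower values are beneficial (e.g. a log risk ratio for an
    unwanted event) and tau > 0.
    """
    dist = NormalDist(mu, tau)
    p_benefit = dist.cdf(benefit_threshold)   # P(true effect < benefit threshold)
    p_harm = 1.0 - dist.cdf(harm_threshold)   # P(true effect > harm threshold)
    return p_benefit, p_harm


# Hypothetical values on the log risk ratio scale.
p_benefit, p_harm = proportion_beyond(
    mu=-0.2, tau=0.3, benefit_threshold=-0.1, harm_threshold=0.1
)
print(f"relevant benefit: {p_benefit:.1%}, relevant harm: {p_harm:.1%}")
```

With few studies, `tau` is estimated imprecisely, so proportions computed this way inherit that imprecision.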

Journal Article

Prediction of electricity price intervals using dynamic bayesian networks

by Wang, Hongtao in Accuracy , Alternative energy sources , Bayesian analysis

2025

The increasing volatility of electricity prices, driven by the growing share of renewable energy, calls for new approaches. This paper proposes a dynamic Bayesian network (DBN) method for electricity price interval forecasting. The model uses predicted values of wind power generation, total power generation, and total electricity consumption, along with historical electricity prices, as inputs. The network structure is determined using a greedy search algorithm, and the model parameters are estimated through maximum likelihood estimation (MLE). By treating the predictions of wind power, total generation, and total consumption as reasoning evidence, the method employs joint tree inference to generate discrete states and posterior probabilities for electricity prices, thereby enabling interval forecasting. The DBN-based interval predictions achieve a prediction interval coverage probability (PICP) of 95.24%, a normalized average width (PINAW) of 9.25%, and an accumulated width deviation (AWD) of 0.56%. The effectiveness of the proposed method was evaluated by comparing its predictions with actual electricity prices and with results from both particle swarm optimization-kernel extreme learning machine (PSO-KELM) and long short-term memory (LSTM)-based methods. This innovative approach not only provides prediction intervals but also associates them with corresponding probabilities, offering significant potential to enhance market participants’ decision-making and mitigate price risks.
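The coverage and width metrics named above can be sketched as follows. The abstract does not give its formulas, so this assumes the commonly used definitions: PICP is the fraction of observations inside their interval, and PINAW is the mean interval width divided by the range of the observations. AWD is omitted, and the data are hypothetical.

```python
def picp(y, lower, upper):
    """Prediction interval coverage probability: share of observations
    that fall inside their interval (bounds inclusive)."""
    hits = sum(1 for yi, lo, hi in zip(y, lower, upper) if lo <= yi <= hi)
    return hits / len(y)


def pinaw(y, lower, upper):
    """Prediction interval normalized average width: mean interval width
    divided by the range of the observed values."""
    value_range = max(y) - min(y)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(y)
    return mean_width / value_range


# Hypothetical prices and interval bounds.
prices = [10.0, 12.0, 15.0, 20.0]
lower = [9.0, 11.0, 16.0, 18.0]
upper = [11.0, 13.0, 18.0, 22.0]
print(picp(prices, lower, upper))   # 0.75
print(pinaw(prices, lower, upper))  # 0.25
```

High coverage alone is not sufficient, since very wide intervals achieve it trivially; that is why the two metrics are reported together.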

Journal Article

Uncertainty Evaluation in Hydrological Frequency Analysis Based on Confidence Interval and Prediction Interval

by Yamada, Tadashi , Yamada, Tomohito J. , Shimizu, Keita in Bayesian theory , climate , Climate change

2020

The shortage of extreme rainfall data introduces substantial uncertainty into design rainfalls and causes predictions for torrential rainfall to deviate strongly from the probability distributions adopted in river planning. These torrential rainfalls are treated as outliers, which existing studies do not evaluate. However, the acceptance region of the probability limit method test expresses with high accuracy the range within which the observed i-th order statistics can fall. A confidence interval that quantifies the uncertainty of the adopted distribution can be constructed by assuming that the critical values on both sides of this region follow the same functional form as that applied to the observed data. Its validity is confirmed through comparison with a confidence interval derived from ensemble downscaling calculations. In addition, these critical values are almost in accordance with outliers in samples from the ensemble downscaling calculations. Therefore, a prediction interval, which expresses the range that an unknown observed datum can take, is constructed by extrapolating the critical values to obtain limit estimates for a future datum. This paper presents a method for quantifying the uncertainty of design rainfall and the occurrence risk of outliers within the traditional framework, using the proposed confidence and prediction intervals. Their application to future climate using Bayesian statistics is also explained.

Journal Article

Predictive uncertainty assessment in flood forecasting using quantile regression

by M. K., Amina , N. R, Chithra in Artificial neural networks , average relative interval length (aril) , Boundary conditions

2023

Floods and their associated impacts are topics of concern in land development planning and management, which call for efficient flood forecasting and warning systems. The performance of flood warning systems is affected by uncertainty in water level forecasts, which arises from the inability to measure or calculate a modeled value accurately. Predictive uncertainty is an emerging uncertainty modeling approach that emphasizes total uncertainty, quantified as a probability distribution conditioned on all available knowledge. Predictive uncertainty analysis was carried out using quantile regression (QR) for two machine learning-based flood models, the Hybrid Wavelet Artificial Neural Network model (WANN) and the Hybrid Wavelet Support Vector Machine model (WSVM), for different lead times. Comparing the QR models of WANN and WSVM revealed that the slope, intercept, spread of forecast, and width of the confidence band of the WANN model are larger for each quantile, indicating more uncertainty than in the WSVM model. In both models, uncertainty increased with lead time. The inference obtained from the QR models was evaluated using uncertainty statistics such as prediction interval coverage probability, average relative interval length (ARIL), and mean prediction interval (MPI).
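Quantile regression fits each quantile by minimizing the pinball (quantile) loss. The sketch below shows only that loss, not the WANN or WSVM models from the study; the function name and the water-level values are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1).

    Under-prediction is weighted by q and over-prediction by (1 - q), so
    the loss is minimized when y_pred is the q-th conditional quantile.
    """
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


# Hypothetical observed water levels (m) and a forecast of the 0.9 quantile.
observed = [2.1, 2.4, 3.0, 2.8]
forecast_q90 = [2.5, 2.6, 3.1, 3.2]
print(pinball_loss(observed, forecast_q90, q=0.9))
```

Fitting one model per quantile level (for example 0.05 and 0.95) gives the lower and upper bounds of a prediction interval.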

Journal Article

Clinical Bedside Benchmarking Test for Measuring the Total Hemoglobin Concentration

by Dörries, Frank , Gehring, Hartmut , Niemuth, Stefan S. in Accuracy , Blood gas analysis , Health aspects

2025

Objective: Accurate total hemoglobin concentration (ctHb) measurement is critical for clinical decision-making, particularly in acute care, where immediate therapeutic decisions are required. This study evaluated previously established laboratory-based accuracy criteria for ctHb measurements in routine clinical practice at an interdisciplinary operative intensive care unit (IO-ICU), with particular attention to significantly reduced hemoglobin concentrations. Method: Remaining blood from blood gas analysis (BGA) cuvettes was collected directly at the ICU bedside. From these initial samples, three clinically relevant measurement scenarios were established: direct bedside measurement (Group 01), elevated ctHb levels (Group 02), and lowered ctHb concentrations below 9 g/dL (Group 03). The samples were analyzed using the GEM 4000, GEM 5000 (Werfen GmbH, Muenchen, Germany), ABL90 Flex plus (Radiometer GmbH, Krefeld, Germany), HemoCue Hb 201+, and XN 9000/9100 (Sysmex Deutschland GmbH, Norderstedt, Germany) automatic hematology analyzers. Since each measurement device inherently possesses systematic deviations, no single analyzer was defined as an absolute reference. Instead, the mean value across all tested measurement systems was used as a best-fit reference (REF) value. Results: A total of 120 data pairs from 40 ICU patients were analyzed using regression analyses, Bland and Altman (B&A) methods, and tolerance level analysis (TLA). The results demonstrated strong concordance among the evaluated measurement devices across the examined ctHb spectrum (~1–18 g/dL). Moderate systematic deviations identified by B&A analysis were most pronounced at critically low ctHb levels (<6 g/dL). A key outcome was the determination of 95% prediction intervals (PIs), representing a quantifiable range of uncertainties for future bedside measurements. The PIs for Group 03 “low” were in the range of ±7% (relative difference) or ±0.38 g/dL (absolute difference). Conclusion: This study effectively translates previous laboratory findings into clinical practice, highlighting the practical utility of PIs to guide the accurate interpretation of bedside ctHb measurements under acute care conditions.

Journal Article

Single-Objective and Multi-Objective Flood Interval Forecasting Considering Interval Fitting Coefficients

by Chang, Xinyu , Ren, Pingan , Guo, Jun in Accuracy , Algorithms , Climate change

2024

Human activities and climate change have increased the frequency of extreme weather events such as rainstorms and floods, which makes it difficult to accurately quantify the uncertainty in runoff prediction. The lower and upper boundary estimation method (LUBE) has therefore become an important and widely used means of quantifying uncertainty. However, the traditional interval prediction evaluation system relies only on coverage and width indicators and performs poorly in single-objective optimization methods, which limits the large-scale application of the LUBE method. This study proposes the prediction interval fitting coefficient (PIFC) and combines it with the prediction interval coverage probability (PICP) and the normalized average width index (PINAW) to construct the coverage width fitting-based criterion (CWFC), which broadens the set of dimensions along which interval predictions are evaluated. Further, single-objective and multi-objective LUBE interval forecasting models based on the randomized weighted particle swarm algorithm (RWPSO) and the non-dominated sorting genetic algorithm III (NSGA-III) are constructed. The verification results for cascade hydropower stations in the Yalong river basin show that both the calculation efficiency and the prediction performance of the single-objective interval prediction model improve after the introduction of PIFC. Under the CWFC objective function, the PINAW and PIFC indexes of the prediction interval are significantly better, and the PICP gap is smaller. Under multi-objective conditions (PICP, PINAW and PIFC), the Pareto non-inferior solution set can provide more choices for decision makers. During the flood season, PICP can reach more than 93%, PINAW is kept below 10%, and PIFC can reach more than 0.95. This indicates that the performance of interval prediction improves significantly after the introduction of PIFC, and the results offer a new approach to basin interval prediction.

Journal Article

Distributional conformal prediction

by Wüthrich, Kaspar , Zhu, Yinchu , Chernozhukov, Victor in Integral transforms , Intervals , Physical Sciences

2021

We propose a robust method for constructing conditionally valid prediction intervals based on models for conditional distributions such as quantile and distribution regression. Our approach can be applied to important prediction problems, including cross-sectional prediction, k–step-ahead forecasts, synthetic controls and counterfactual prediction, and individual treatment effects prediction. Our method exploits the probability integral transform and relies on permuting estimated ranks. Unlike regression residuals, ranks are independent of the predictors, allowing us to construct conditionally valid prediction intervals under heteroskedasticity. We establish approximate conditional validity under consistent estimation and provide approximate unconditional validity under model misspecification, under overfitting, and with time series data. We also propose a simple “shape” adjustment of our baseline method that yields optimal prediction intervals.

Journal Article

Valid prediction intervals for regression problems

by Dewolf, Nicolas , Waegeman, Willem , Baets, Bernard De in Bayesian analysis , Calibration , Comparative analysis

2023

Over the last few decades, various methods have been proposed for estimating prediction intervals in regression settings, including Bayesian methods, ensemble methods, direct interval estimation methods and conformal prediction methods. An important issue is the validity and calibration of these methods: the generated prediction intervals should have a predefined coverage level, without being overly conservative. So far, no study has analysed this issue whilst simultaneously considering these four classes of methods. In this independent comparative study, we review the above four classes of methods from a conceptual and experimental point of view in the i.i.d. setting. Results on benchmark data sets from various domains highlight large fluctuations in performance from one data set to another. These observations can be attributed to the violation of certain assumptions that are inherent to some classes of methods. We illustrate how conformal prediction can be used as a general calibration procedure for methods that deliver poor results without a calibration step.
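The use of conformal prediction as a calibration step can be illustrated with split conformal prediction, one standard variant; the abstract does not say which variant the study uses. The sketch assumes exchangeable data, a point predictor fitted on separate training data, and absolute residuals as the conformity score. Function names and residual values are hypothetical.

```python
import math


def split_conformal_halfwidth(calib_residuals, alpha):
    """Half-width that gives marginal coverage of at least 1 - alpha.

    calib_residuals are y - prediction on a held-out calibration set that
    was not used to fit the predictor.
    """
    scores = sorted(abs(r) for r in calib_residuals)
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def conformal_interval(prediction, halfwidth):
    """Symmetric interval around a point prediction."""
    return prediction - halfwidth, prediction + halfwidth


# Hypothetical calibration residuals from any regression model.
residuals = [0.5, -1.0, 2.0, -0.2, 1.5, 0.1, -0.7, 0.9, -1.2]
halfwidth = split_conformal_halfwidth(residuals, alpha=0.1)
print(conformal_interval(10.0, halfwidth))
```

Because the half-width depends only on held-out residuals, the same step can wrap any underlying regression method, which is what makes it usable as a general calibration procedure.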

Journal Article

SimBoost: a read-across approach for predicting drug–target binding affinities using gradient boosting machines

by Ban, Fuqiang , Cherkasov, Artem , Heidemeyer, Marten in Applicability Domain , Benchmarks , Binding

2017

Computational prediction of the interaction between drugs and targets is a standing challenge in the field of drug discovery. A number of rather accurate predictions were reported for various binary drug–target benchmark datasets. However, a notable drawback of a binary representation of interaction data is that missing endpoints for non-interacting drug–target pairs are not differentiated from inactive cases, and that predicted levels of activity depend on pre-defined binarization thresholds. In this paper, we present a method called SimBoost that predicts continuous (non-binary) values of binding affinities of compounds and proteins and thus incorporates the whole interaction spectrum from true negative to true positive interactions. Additionally, we propose a version of the method called SimBoostQuant which computes a prediction interval in order to assess the confidence of the predicted affinity, thus defining the Applicability Domain metrics explicitly. We evaluate SimBoost and SimBoostQuant on two established drug–target interaction benchmark datasets and one new dataset that we propose to use as a benchmark for read-across cheminformatics applications. We demonstrate that our methods outperform the previously reported models across the studied datasets.

Journal Article

Key concepts in clinical epidemiology: detecting and dealing with heterogeneity in meta-analyses

by Cordero, Cynthia P. , Dans, Antonio L. in Confidence intervals , Diarrhea , Epidemiology

2021

In a meta-analysis, a question always arises. Is it worthwhile to combine estimates from studies of different populations using various formulations of an intervention, evaluating outcomes measured differently? Sometimes even study designs differ. Differences are expected in a meta-analysis. These may be negligible, and a pooled estimate of effect can guide the clinical decision. However, when the differences are large, this estimate may mislead. Effect estimates from study to study differ because of real differences (between-study variability) and because of chance (within-study variability). To combine estimates when there is heterogeneity (between-study differences are large) may not be sensible. Two complementary methods may be used to detect heterogeneity: visual inspection of the forest plot and calculating numerical measures of heterogeneity (I2 and Q). Visual inspection can show effects that are different from the rest. A large I2 (proportion of overall variability attributed to between-study variation) or a small P-value associated with Q may suggest heterogeneity. Large P-values, however, do not mean the absence of heterogeneity. It is more informative to report the confidence interval of the I2. If there is no heterogeneity, a pooled estimate of the true effect may be generated using only within-study variation (fixed-effect model). If there is substantial heterogeneity, reasons should be sought. Subgroup analysis or meta-regression using study-level characteristics may be done. Although more involved and potentially challenging, individual-level data (Individual Participant Data, IPD) may also be used. In the case of unexplained heterogeneity, both within- and between-study variation should be used to generate a pooled estimate (random-effects model). This estimate does not estimate a single true effect but estimates the average of a range of effects of the intervention on populations represented by the studies. 
If precise enough (narrow confidence interval), this estimate, together with the prediction interval (a measure of uncertainty in the effect one might see in a particular context), can guide clinical and policy decisions.
• While differences are expected in a meta-analysis, these may be negligible, and a pooled estimate can guide the clinical decision. However, when the differences are large, this estimate may mislead.
• The danger of reporting pooled estimates is that readers may overlook the overall picture: some studies have bigger effects than others, and some effects point in a different direction (harm) from the benefit shown by most studies. A careful inspection of the forest plot can help detect these differences, which we refer to as heterogeneity.
• Visual inspection should be used together with measures of heterogeneity (I2 and Q). High values of I2 and small P-values associated with Q may suggest heterogeneity. But large P-values do not mean the absence of heterogeneity. It is more informative to report the confidence interval of I2.
• If heterogeneity is detected, an explanation must be sought, and analysis using study-level characteristics (subgroup analysis or meta-regression) may be done. Although intensive, analysis using individual-level data (Individual Participant Data) may also be done.
• In case of unexplained heterogeneity, a pooled estimate using the random-effects model may be used. This estimate no longer estimates a single unknown effect but the average of the effects of the intervention in the populations represented by the studies. If precise enough (narrow confidence interval), this estimate, together with the prediction interval (a measure of uncertainty in the effect one might see in a particular context), can guide clinical and policy decisions.
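The heterogeneity measures and the two pooling models described above can be sketched as follows. The article does not name an estimator for the between-study variance, so the sketch assumes the DerSimonian-Laird method and inverse-variance weights; the function name and input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, I2, DerSimonian-Laird tau^2, and pooled estimates.

    effects are study effect estimates and variances their within-study
    variances. Requires at least two studies.
    """
    if len(effects) < 2:
        raise ValueError("need at least two studies")
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect pooled estimate: within-study variation only.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variability attributed to between-study variation.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    # Random-effects pooled estimate: within- plus between-study variation.
    re_weights = [1.0 / (v + tau2) for v in variances]
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return {"Q": q, "I2": i2, "tau2": tau2, "fixed": fixed, "random": random}


# Hypothetical log odds ratios and their variances.
result = heterogeneity([-0.4, -0.1, 0.3, -0.6], [0.04, 0.09, 0.05, 0.12])
print(result)
```

When `tau2` is zero the random-effects weights reduce to the fixed-effect weights and the two pooled estimates coincide.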

Journal Article
