Catalogue Search | MBRL
604 result(s) for "IOANNIDIS, JOHN P.A."
Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial
by Ioannidis, John P.A.; Ades, A.E.; Salanti, Georgia
in Bayes Theorem; Bayesian meta-analysis; Biological and medical sciences
2011
To present some simple graphical and quantitative ways to assist interpretation and improve presentation of results from multiple-treatment meta-analysis (MTM).
We reanalyze a published network of trials comparing various antiplatelet interventions regarding the incidence of serious vascular events using Bayesian approaches for random effects MTM, and we explore the advantages and drawbacks of various traditional and new forms of quantitative displays and graphical presentations of results.
We present the results in various forms: conventionally, based on the mean of the distribution of the effect sizes; based on predictions; based on ranking probabilities; and finally, based on probabilities to be within an acceptable range from a reference. We show how to obtain and present results on the ranking of all treatments and how to appraise the overall ranks.
Bayesian methodology offers a multitude of ways to present results from MTM models, as it enables a natural and easy estimation of all measures based on probabilities, ranks, or predictions.
Journal Article
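The ranking probabilities this tutorial describes can be computed directly from posterior draws. A minimal sketch, not the authors' code; the draw format, treatment names, and the "larger is better" convention are assumptions:

```python
def ranking_probabilities(posterior_draws):
    """Estimate P(treatment t has rank r) from joint posterior draws.

    posterior_draws: list of dicts mapping treatment name -> sampled
    effect size from one MCMC iteration (larger = better, an assumed
    convention). Returns treatment -> list of rank probabilities,
    index 0 = probability of being ranked best.
    """
    treatments = list(posterior_draws[0])
    k = len(treatments)
    counts = {t: [0] * k for t in treatments}
    for draw in posterior_draws:
        # rank all treatments within this single posterior draw
        ordered = sorted(treatments, key=lambda t: draw[t], reverse=True)
        for rank, t in enumerate(ordered):
            counts[t][rank] += 1
    n = len(posterior_draws)
    return {t: [c / n for c in counts[t]] for t in treatments}
```

Ranking within each draw and then averaging, rather than ranking the point estimates, is what lets these probabilities reflect the joint posterior uncertainty.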
Assessment of vibration of effects due to model specification can demonstrate the instability of observational associations
by Ioannidis, John P.A.; Patel, Chirag J.; Burford, Belinda
in Adjustment; Biomarkers; Biostatistics
2015
Model specification—what adjusting variables are analytically modeled—may influence results of observational associations. We present a standardized approach to quantify the variability of results obtained with choices of adjustments called the “vibration of effects” (VoE).
We estimated the VoE for 417 clinical, environmental, and physiological variables in association with all-cause mortality using National Health and Nutrition Examination Survey data. We selected 13 variables as adjustment covariates and computed 8,192 Cox models for each of 417 variables' associations with all-cause mortality.
We present the VoE by assessing the variance of the effect size and of the −log10(P-value) obtained across the different combinations of adjustments. We assess whether there are multimodality patterns in effect sizes and P-values and the trajectory of results with increasing adjustments. For 31% of the 417 variables, we observed a Janus effect, with the effect being in the opposite direction in the 99th versus the 1st percentile of analyses. For example, the vitamin E variant α-tocopherol had a VoE that indicated both higher and lower risk for mortality, depending on the adjustments chosen.
Estimating VoE offers empirical estimates of how associations vary under different model specifications. When VoE is large, claims for observational associations should be made very cautiously.
Journal Article
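The core VoE computation is a brute-force sweep over all 2^k subsets of adjustment covariates (2^13 = 8,192 models per variable in the study above). A minimal sketch, with a caller-supplied `fit` function standing in for the Cox regression; the function name and its return convention are assumptions:

```python
from itertools import combinations

def vibration_of_effects(covariates, fit):
    """Re-fit a model under every combination of adjustment covariates
    and summarize the spread of the resulting effect estimates.

    covariates: candidate adjustment variable names.
    fit: callable taking a tuple of covariate names and returning an
         effect estimate, e.g. a log hazard ratio (hypothetical here).
    """
    estimates = sorted(fit(subset)
                       for r in range(len(covariates) + 1)
                       for subset in combinations(covariates, r))
    # 1st and 99th percentiles (nearest rank) across analyses; a
    # "Janus effect" means the extremes point in opposite directions
    lo = estimates[max(0, int(0.01 * len(estimates)) - 1)]
    hi = estimates[min(len(estimates) - 1, int(0.99 * len(estimates)))]
    return estimates, lo, hi, lo < 0.0 < hi
```

With 13 candidate covariates the double loop enumerates exactly the 8,192 model specifications per variable reported in the abstract.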
Development of the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN) in randomized controlled trials and meta-analyses
by van der Heijden, Geert J.M.G.; Sauerbrei, Willi; Varadhan, Ravi
in Analysis; Clinical trials; Consensus
2020
Most randomized controlled trials (RCTs) and meta-analyses of RCTs examine effect modification (also called a subgroup effect or interaction), in which the effect of an intervention varies by another variable (e.g., age or disease severity). Assessing the credibility of an apparent effect modification presents challenges; therefore, we developed the Instrument for assessing the Credibility of Effect Modification Analyses (ICEMAN).
To develop ICEMAN, we established a detailed concept; identified candidate credibility considerations in a systematic survey of the literature; together with experts, performed a consensus study to identify key considerations and develop them into instrument items; and refined the instrument based on feedback from trial investigators, systematic review authors and journal editors, who applied drafts of ICEMAN to published claims of effect modification.
The final instrument consists of a set of preliminary considerations, core questions (5 for RCTs, 8 for meta-analyses) with 4 response options, 1 optional item for additional considerations and a rating of credibility on a visual analogue scale ranging from very low to high. An accompanying manual provides rationales, detailed instructions and examples from the literature. Seventeen potential users tested ICEMAN; their suggestions improved the user-friendliness of the instrument.
The Instrument for assessing the Credibility of Effect Modification Analyses offers explicit guidance for investigators, systematic reviewers, journal editors and others considering making a claim of effect modification or interpreting a claim made by others.
Journal Article
External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination
by Siontis, George C.M.; Tzoulaki, Ioanna; Ioannidis, John P.A.
in Area Under Curve; Area under the receiver operating characteristics curve; Biomarkers
2015
To evaluate how often newly developed risk prediction models undergo external validation and how well they perform in such validations.
We reviewed derivation studies of newly proposed risk models and their subsequent external validations. Study characteristics, outcome(s), and models' discriminatory performance [area under the curve (AUC)] in derivation and validation studies were extracted. We estimated the probability of a model receiving an external validation and the change in discriminatory performance, relative to the derivation estimates, under increasingly stringent forms of external validation (by overlapping versus entirely different authors).
We evaluated 127 new prediction models. Of those, for 32 models (25%), at least one external validation study was identified; in 22 models (17%), the validation had been done by entirely different authors. The probability of having an external validation by different authors within 5 years was 16%. AUC estimates significantly decreased during external validation vs. the derivation study [median AUC change: −0.05 (P < 0.001) overall; −0.04 (P = 0.009) for validation by overlapping authors; −0.05 (P < 0.001) for validation by different authors]. On external validation, AUC decreased by at least 0.03 in 19 models and never increased by at least 0.03 (P < 0.001).
External independent validation of predictive models in different studies is uncommon. Predictive performance may worsen substantially on external validation.
Journal Article
Effect estimates of COVID-19 non-pharmaceutical interventions are non-robust and highly model-dependent
by Ioannidis, John P.A.; Chin, Vincent; Tanner, Martin A.
in Bayesian analysis; Bayesian statistics; Communicable Disease Control - methods
2021
•Different SIR models developed by the same modeling team on the effectiveness of various non-pharmaceutical interventions (NPIs) for COVID-19 were compared.
•The model proposing major benefits from lockdown in European countries had the worst fit to the data.
•Models with better fit to the data showed little or no benefit from lockdown.
•Inferences on the effects of non-pharmaceutical interventions are non-robust and depend on model specification and selection.
To compare the inference regarding the effectiveness of the various non-pharmaceutical interventions (NPIs) for COVID-19 obtained from different SIR models.
We explored two models developed by Imperial College that considered only NPIs without accounting for mobility (model 1) or only mobility (model 2), and a model accounting for the combination of mobility and NPIs (model 3). Imperial College applied models 1 and 2 to 11 European countries and to the USA, respectively. We applied these models to 14 European countries (original 11 plus another 3), over two different time horizons.
While model 1 found that lockdown was the most effective measure in the original 11 countries, model 2 showed that lockdown had little or no benefit as it was typically introduced at a point when the time-varying reproduction number was already very low. Model 3 found that the simple banning of public events was beneficial, while lockdown had no consistent impact. Based on Bayesian metrics, model 2 was better supported by the data than either model 1 or model 3 for both time horizons.
Inferences on effects of NPIs are non-robust and highly sensitive to model specification. In the SIR modeling framework, the impacts of lockdown are uncertain and highly model-dependent.
Journal Article
Head-to-head randomized trials are mostly industry sponsored and almost always favor the industry sponsor
by Flacco, Maria Elena; Siliquini, Roberta; Scaioli, Giacomo
in Asia; Clinical trials; Confidence intervals
2015
To map the current status of head-to-head comparative randomized evidence and to assess whether funding may affect trial design and results.
From a 50% random sample of the randomized controlled trials (RCTs) published in journals indexed in PubMed during 2011, we selected the trials with ≥100 participants, evaluating the efficacy and safety of drugs, biologics, and medical devices through a head-to-head comparison.
We analyzed 319 trials. Overall, 238,386 of the 289,718 randomized subjects (82.3%) were included in the 182 trials funded by companies. Of the 182 industry-sponsored trials, only 23 had two industry sponsors and only three involved truly antagonistic comparisons. Industry-sponsored trials were larger, more commonly registered, more frequently used noninferiority/equivalence designs, had higher citation impact, and were more likely to have “favorable” results (superiority or noninferiority/equivalence for the experimental treatment) than nonindustry-sponsored trials. Industry funding [odds ratio (OR) 2.8; 95% confidence interval (CI): 1.6, 4.7] and noninferiority/equivalence designs (OR 3.2; 95% CI: 1.5, 6.6), but not sample size, were strongly associated with “favorable” findings. Fifty-five of the 57 (96.5%) industry-funded noninferiority/equivalence trials obtained “favorable” results.
The literature of head-to-head RCTs is dominated by the industry. Industry-sponsored comparative assessments systematically yield favorable results for the sponsors, even more so when noninferiority designs are involved.
Journal Article
Meta-Analysis Comparing Established Risk Prediction Models (EuroSCORE II, STS Score, and ACEF Score) for Perioperative Mortality During Cardiac Surgery
by Wallach, Joshua D.; Sullivan, Patrick G.; Ioannidis, John P.A.
in Cardiac Surgical Procedures - mortality; Cardiovascular; Cardiovascular disease
2016
A wide variety of multivariable risk models have been developed to predict mortality in the setting of cardiac surgery; however, the relative utility of these models is unknown. This study investigated the literature on comparisons between established risk prediction models for perioperative mortality used in the setting of cardiac surgery.
A systematic review was conducted to capture studies in cardiac surgery comparing the relative performance of at least 2 prediction models cited in recent guidelines (European System for Cardiac Operative Risk Evaluation [EuroSCORE II], Society of Thoracic Surgeons 2008 Cardiac Surgery Risk Models [STS] score, and Age, Creatinine, Ejection Fraction [ACEF] score) for the outcomes of 1-month or inhospital mortality. For articles that met inclusion criteria, we extracted information on study design, predictive performance of risk models, and potential for bias. Meta-analyses were conducted to calculate a summary estimate of the difference in areas under the curve (AUCs) between models.
We identified 22 eligible studies that contained 33 comparisons among the above models. Meta-analysis of differences in AUCs revealed that the EuroSCORE II and STS score performed similarly (with a summary difference in AUC = 0.00), while outperforming the ACEF score (with summary differences in AUC of 0.10 and 0.08, respectively, p <0.05). Other metrics of discrimination and calibration were presented less consistently, and no study presented any metric of reclassification. Small sample size and absent descriptions of missing data were common in these studies.
In conclusion, the EuroSCORE II and STS score outperform the ACEF score on discrimination.
Journal Article
Excess Significance Bias in Repetitive Transcranial Magnetic Stimulation Literature for Neuropsychiatric Disorders
by Rousseau, Chloé; Ioannidis, John P. A.; Larochelle, Yann
in Care and treatment; Evaluation; Life Sciences
2019
Introduction: Repetitive transcranial magnetic stimulation (rTMS) has been widely tested and promoted for use in multiple neuropsychiatric conditions, but as for many other medical devices, some gaps may exist in the literature, and the evidence base for the clinical efficacy of rTMS remains under debate.
Objective: We aimed to test for an excess number of statistically significant results in the literature on the therapeutic efficacy of rTMS across a wide range of meta-analyses and to characterize the power of studies included in these meta-analyses.
Methods: Based on power calculations, we computed the expected number of “positive” datasets for a medium effect size (standardized mean difference, SMD = 0.30) and compared it with the number of observed “positive” datasets. Sensitivity analyses considered small (SMD = 0.20), modest (SMD = 0.50), and large (SMD = 0.80) effect sizes.
Results: A total of 14 meta-analyses with 228 datasets (110 for neurological disorders and 118 for psychiatric disorders) were assessed. For SMD = 0.3, the number of observed “positive” studies (n = 94) was larger than expected (n = 35). We found evidence for an excess of significant findings overall (p < 0.0001) and in 8/14 meta-analyses. Evidence for an excess of significant findings was also observed for SMD = 0.5 for neurological disorders. Of the 228 datasets, 0 (0%), 0 (0%), 3 (1%), and 53 (23%) had a power >0.80, respectively, for SMDs of 0.30, 0.20, 0.50, and 0.80.
Conclusion: Most studies in the rTMS literature are underpowered. This results in fragmentation and waste of research efforts. The somewhat high frequency of “positive” results seems spurious and may reflect bias. Caution is warranted in accepting rTMS as an established treatment for neuropsychiatric conditions.
Journal Article
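The excess-significance comparison rests on summing per-dataset power at the assumed effect size to obtain the expected number of "positive" datasets. A minimal sketch using the normal approximation to the power of a two-sample test; the per-arm sample sizes and the z-test approximation are assumptions, not the paper's exact method:

```python
import math
from statistics import NormalDist

def expected_positives(per_arm_sizes, smd=0.30, alpha=0.05):
    """Expected number of statistically significant datasets, assuming
    the true standardized mean difference is `smd` in every dataset.

    Power of a two-sided two-sample z-test with n participants per arm:
    power ~= Phi(smd * sqrt(n / 2) - z_{alpha/2}).
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    return sum(nd.cdf(smd * math.sqrt(n / 2.0) - z_crit)
               for n in per_arm_sizes)
```

Comparing this expectation against the observed count of significant datasets (e.g., with a binomial test) is the excess-significance test; in the study above, 94 observed versus roughly 35 expected positives signaled an excess.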
How to Make More Published Research True
2014
In a 2005 paper that has been accessed more than a million times, John Ioannidis explained why most published research findings were false. Here he revisits the topic, this time to address how to improve matters. Please see later in the article for the Editors' Summary.
Journal Article