Catalogue Search | MBRL
124 result(s) for "Burstyn, Igor"
Advice on better utilization of validation data to adjust odds ratios for differential exposure misclassification (recall bias)
2025
We were delighted by the publication in your journal of the results of a validation study on self-reported night shift work by Vestergaard et al (1). Such exquisite validation studies that compare self-report to employment records are rare and sorely needed if we are to draw appropriate inferences from epidemiologic studies, both in characterizing the degree of risk and – as recently argued by IARC – in hazard identification (2). However, we have strong reasons to believe that the validation data which Vestergaard et al obtained could (and should) have been better used to “correct” odds ratios (OR) for differential exposure misclassification. [NB: Our use of the Excel spreadsheet of Lash et al (3) cited in Vestergaard et al (1) leads to the same “corrected” point estimate but a different, wider, 95% confidence interval (CI) of 0.88–1.27. The corrected 95% CI reported in table 3 of (1) is obtained if we use rounded-up counts after adjustment. This is incorrect because expected counts “do not have to be integers”, as stated for the Excel spreadsheet that Vestergaard et al used. This illustrates the importance of using tools as intended and the unexpected impact that apparently small changes to input values can have on the results.] First, we must note that quantitative bias analysis does not correct for exposure misclassification in general. When fixed values of sensitivities and specificities are used, it provides a corrected estimate only under the assumption that misclassification probabilities are known with absolute certainty. However, it is obvious from table 2 of Vestergaard et al (1) that the misclassification probabilities are estimated with uncertainty. When there is uncertainty about sensitivities and specificities, the textbook they quote recommends (urges!) that probabilistic bias analysis be carried out to account simultaneously for uncertainty in misclassification probabilities and random sampling errors (3).
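The probabilistic bias analysis described above can be sketched as a short Monte Carlo simulation. The sketch below is a minimal illustration, assuming hypothetical 2x2 counts and illustrative Beta distributions for the differential sensitivities and specificities; none of these numbers are taken from the study in question.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical observed (misclassified) 2x2 counts -- NOT the study's data.
a_obs, b_obs = 120, 380   # cases: exposed, unexposed
c_obs, d_obs = 200, 800   # controls: exposed, unexposed
n_cases, n_controls = a_obs + b_obs, c_obs + d_obs

ors = []
for _ in range(20_000):
    # Draw differential sensitivity/specificity from Beta distributions
    # (shape parameters are illustrative placeholders).
    se_ca, sp_ca = rng.beta(45, 5), rng.beta(90, 10)   # among cases
    se_co, sp_co = rng.beta(40, 10), rng.beta(95, 5)   # among controls

    # Back-calculate expected true counts; these need not be integers.
    A = (a_obs - (1 - sp_ca) * n_cases) / (se_ca + sp_ca - 1)
    C = (c_obs - (1 - sp_co) * n_controls) / (se_co + sp_co - 1)
    B, D = n_cases - A, n_controls - C
    if min(A, B, C, D) <= 0:
        continue  # discard draws implying impossible (negative) counts
    ors.append((A * D) / (B * C))

lo, med, hi = np.percentile(ors, [2.5, 50, 97.5])
print(f"adjusted OR median {med:.2f}, 95% simulation interval {lo:.2f}-{hi:.2f}")
```

The collection of simulated ORs is then summarized by its median and a simulation interval, which is how results of this kind are reported later in the letter.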
Even when this is done, probabilistic bias analysis does not guarantee correction or adjustment for misclassification of exposure, but merely produces a collection of alternative estimates via Monte-Carlo simulation. An alternative adjustment approach for this case of uncertain misclassification probabilities, which does offer theoretical assurance of correcting the OR for misclassification of exposure, is a Bayesian methodology (4, 5). Probabilistic bias analysis and Bayesian methods are not guaranteed to produce identical numerical results, and only Bayesian methods produce results that can be interpreted as distributions of true values given data, model and priors (6). Second, it is known to be risky to adjust for exposure misclassification using fixed values of sensitivities and specificities if these are not known exactly (4). Small deviations from the true misclassification probabilities can have a dramatic impact on the resulting adjustment. Thus, the corrected OR in Vestergaard et al (1) of 1.05 (95% CI 0.95–1.16) is just one of many adjusted estimates that are consistent with the presented validation data, as we show below. Bayesian methods again come to the rescue here because they are designed to account for uncertainty in misclassification parameters by using prior probability distributions. Third, we are puzzled by Vestergaard et al's choice of using the bootstrap to estimate distributions of sensitivities and specificities when there is a far simpler accepted approach to expressing uncertainty about proportions in quantitative bias analyses (Bayesian or probabilistic). When a validation study estimates a proportion k/N, the uncertainty about the true value of the proportion is typically expressed with a Beta distribution, which is defined on [0,1] and is a conjugate prior of the Bernoulli distribution.
For an observed proportion k/N, given that before performing the validation study we were completely ignorant about the value of the proportion, the Beta(α,β) distribution that captures this information has shape parameters α=k+1 and β=N-k+1, eg, see (7). We calculated these shape parameters for the misclassification probabilities from table 2 of Vestergaard et al (1) (partially reproduced in table 1) and present them in our table 2, which also shows the corresponding means and variances. Fourth, we observe that the Bayesian adjustment for differential exposure misclassification yields what may be considered qualitatively different results compared to Vestergaard et al's adjustment using fixed values. We followed the implementation from Singer et al (8). The Bayesian approach imposed no correlation between the misclassification parameters. We used a vague prior on the OR, null-centered with 95% CI 0.02–50, as recommended for a sparse data problem (9). We also specified a uniform prior (0–1) on the exposure prevalence among controls. The Bayesian model converged and none of its diagnostics appeared anomalous; implementation details, centered on the R (10) package rjags (11), can be found in the supplementary material (www.sjweh.fi/article/4226) appendix A. Summaries of the posterior distributions are presented in table 3. The posterior OR adjusted for recall bias had a mean of 0.98, a median of 0.97 and a credible interval of 0.30–1.71. As an added benefit, we have learned about the distributions of misclassification parameters and true prevalences, which can be used further if one is to update the study in question or use similar exposure assessment tools in a setting where similar exposure misclassification is suspected. Lastly, we carried out our own probabilistic bias analysis using the same Beta distributions as in table 2, assuming that the correlation of sensitivities and specificities is weak (ie, 0.1).
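The Beta(α,β) construction for a validated proportion can be sketched in a few lines of Python; the validation counts in the example are hypothetical, not those of the study's table 2.

```python
def beta_from_validation(k, n):
    """Beta(alpha, beta) expressing uncertainty about a proportion k/n
    under a flat prior of complete ignorance: alpha = k+1, beta = n-k+1."""
    alpha, beta = k + 1, n - k + 1
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return alpha, beta, mean, var

# Hypothetical validation result: 45 of 50 truly exposed recalled as exposed.
print(beta_from_validation(45, 50))  # alpha=46, beta=6, mean ~0.885
```

The mean and variance follow the standard closed forms for the Beta distribution, so a table of shape parameters (like table 2 of the letter) can be derived directly from the validation counts.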
Details of the implementation of probabilistic bias analysis using the R package episensr (12) are available in supplementary appendix B. The resulting simulated OR had a median of 1.00 and a 95% simulation interval of 0.48–1.31. Thus, Vestergaard et al (1) is an example of a study where using fixed values of misclassification probabilities leads to a rather different estimate of 1.05 (95% CI 0.95–1.16) compared to both the probabilistic bias analysis and the Bayesian adjustment method that use the same validation data. Distributions of the OR obtained after probabilistic and Bayesian adjustments are illustrated in figure 1, which shows that the Bayesian method (in red) favors lower true values of the OR compared to the probabilistic one (in gray). When faced with numerically different results of adjustment for exposure misclassification, we advise our colleagues to rely on the results that arise from the more theoretically justified methodology. In the case of adjustment from Vestergaard et al (1), we think that the Bayesian results are more defensible, yielding an adjusted OR centered around 1.0 (95% credible interval 0.3–1.7). This result appears to us to be a rather more convincing estimate of the association of breast cancer with report of ever having worked night shifts than Vestergaard et al's “corrected” estimate. We urge epidemiologists who collect precious validation data to collaborate with statisticians who can help them utilize these data fully, arriving at more defensible effect estimates and, ultimately, better risk assessments.

References

1. Vestergaard JM, Haug JN, Dalbøge A, Bonde JP, Garde AH, Hansen J et al. Validity of self-reported night shift work among women with and without breast cancer. Scand J Work Environ Health 2024 Apr;50(3):152–7. https://doi.org/10.5271/sjweh.4142.
2. IARC. Statistical Methods in Cancer Research Volume V: Bias Assessment in Case–Control and Cohort Studies for Hazard Identification. IARC Scientific Publication No. 171, 1st ed. Lyon, France: International Agency for Research on Cancer; 2024.
3. Lash TL, Fox MP, Fink AK. Applying Quantitative Bias Analysis to Epidemiologic Data. Springer; 2021.
4. Gustafson P, Le ND, Saskin R. Case-control analysis with partial knowledge of exposure misclassification probabilities. Biometrics 2001 Jun;57(2):598–609. https://doi.org/10.1111/j.0006-341X.2001.00598.x.
5. Gustafson P. Measurement Error and Misclassification in Statistics and Epidemiology. Chapman & Hall/CRC Press; 2004.
6. MacLehose RF, Gustafson P. Is probabilistic bias analysis approximately Bayesian? Epidemiology 2012 Jan;23(1):151–8. https://doi.org/10.1097/EDE.0b013e31823b539c.
7. Luta G, Ford MB, Bondy M, Shields PG, Stamey JD. Bayesian sensitivity analysis methods to evaluate bias due to misclassification and missing data using informative priors and external validation data. Cancer Epidemiol 2013 Apr;37(2):121–6. https://doi.org/10.1016/j.canep.2012.11.006.
8. Singer AB, Daniele Fallin M, Burstyn I. Bayesian correction for exposure misclassification and evolution of evidence in two studies of the association between maternal occupational exposure to asthmagens and risk of autism spectrum disorder. Curr Environ Health Rep 2018 Sep;5(3):338–50. https://doi.org/10.1007/s40572-018-0205-0.
9. Greenland S, Mansournia MA, Altman DG. Sparse data bias: a problem hiding in plain sight. BMJ 2016 Apr;352:i1981. https://doi.org/10.1136/bmj.i1981.
10. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2006.
11. Plummer M. rjags: Bayesian graphical models using MCMC. R package version 4-16; 2024.
12. Haine D. The episensr package: basic sensitivity analysis of epidemiological results. R package version 1.3.0; 2023. Available from: https://dhaine.github.io/episensr/.
Journal Article
Identification of confounder in epidemiologic data contaminated by measurement error in covariates
2016
Background
Common methods for confounder identification, such as directed acyclic graphs (DAGs), hypothesis testing, or a 10% change-in-estimate (CIE) criterion for estimated associations, may not be applicable when (a) knowledge is insufficient to draw a DAG and (b) adjustment for a true confounder produces less than a 10% change in the observed estimate (e.g., in the presence of measurement error).
Methods
We compare a previously proposed simulation-based approach for confounder identification that can be tailored to each specific study, and contrast it with commonly applied methods (significance criteria with cutoff levels of p-values of 0.05 or 0.20, and the CIE criterion with a cutoff of 10%), as well as a newly proposed two-stage procedure aimed at reducing false positives (specifically, risk factors that are not confounders). The new procedure first evaluates the potential for confounding by examining the correlation of covariates, and applies simulated CIE criteria only if there is evidence of correlation, rejecting a covariate as a confounder otherwise. These approaches are compared in simulation studies with binary, continuous, and survival outcomes. We illustrate the application of our proposed confounder identification strategy by examining the association of exposure to mercury with depression in the presence of suspected confounding by fish intake, using the National Health and Nutrition Examination Survey (NHANES) 2009–2010 data.
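The sensitivity of the change-in-estimate criterion to measurement error in the confounder can be illustrated with a small linear-model simulation. This is a hedged sketch with arbitrary coefficients, not the authors' simulation design: classical error is added to the confounder, and the percent change in the exposure coefficient upon adjustment shrinks as the noise grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

def cie(noise_to_signal):
    """Percent change-in-estimate for exposure x when adjusting for a
    confounder z measured with classical error (hypothetical linear model)."""
    z = rng.normal(size=n)                       # true confounder
    x = 0.6 * z + rng.normal(size=n)             # exposure, confounded by z
    y = 0.5 * x + 0.8 * z + rng.normal(size=n)   # outcome
    z_err = z + rng.normal(scale=np.sqrt(noise_to_signal), size=n)

    def beta_x(covars):
        # OLS via least squares; return the coefficient on x.
        X = np.column_stack([np.ones(n)] + covars)
        return np.linalg.lstsq(X, y, rcond=None)[0][1]

    crude, adj = beta_x([x]), beta_x([x, z_err])
    return 100 * abs(crude - adj) / abs(adj)

for r in (0.0, 0.5, 2.0):
    print(f"noise-to-signal {r}: CIE = {cie(r):.1f}%")
```

With an error-free confounder the CIE is large (adjustment matters); as the noise-to-signal ratio grows, the adjusted estimate drifts toward the crude one and the CIE falls, so a fixed 10% cutoff can miss a true confounder.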
Results
Our simulations showed that the simulation-determined cutoff was very sensitive to measurement error in both the exposure and the potential confounder. The analysis of NHANES data demonstrated that if the noise-to-signal ratio (error variance in confounder/variance of confounder) is at or below 0.5, roughly 80% of the simulated analyses adjusting for fish consumption would correctly result in a null association of mercury and depression; only an extremely poorly measured confounder is not useful to adjust for in this setting.
Conclusions
No a priori criterion developed for a specific application is guaranteed to be suitable for confounder identification in general. The customization of model-building strategies and study designs through simulations that consider the likely imperfections in the data, as well as finite-sample behavior, would constitute an important improvement on some of the currently prevailing practices in confounder identification and evaluation.
Journal Article
Survey of practices of handling exposure measurement errors in modern epidemiology: are the best practices in statistics being adopted by epidemiologists?
by
Russell, Anthony James
,
Maldonado, George
,
Burstyn, Igor
in
Anthropometry
,
Best practices
,
Bias
2025
Background
Measurement errors in epidemiological studies can impact the validity and reliability of findings. Without proper context, inferences (causal or otherwise) based on these findings may be compromised. The consequences of measurement error are well known but, in practice, commonly ignored when interpreting findings in epidemiological research.
Methods
We examined papers published in 2022 in three leading epidemiology journals (International Journal of Epidemiology, Epidemiology, and American Journal of Epidemiology) to assess the occurrence and handling of exposure measurement error (EME). We randomly sampled 64 papers that assessed exposure-outcome relationships. Two authors independently reviewed the selected papers and searched for (a) explicit definition of the exposure in question and how it was measured, (b) an acknowledgment of the possibility of exposure measurement error and (c) statistical investigation of the expected impact or adjustment for EME.
Results
Our review of recent epidemiological studies reveals encouraging progress in the interpretation and adjustment of EME; however, room for improvement remains. Among our sample of 64 articles, 2 (3.1%) reported exposures for which measurement error did not exist, 3 (4.7%) lacked a well-defined research question, which precluded proper classification, 8 (12.5%) ignored EME, 24 (37.5%) reported on EME or discussed it as a limitation but treated it as “negligible” without investigating further, 14 (21.9%) conducted sensitivity analyses to describe the potential effect of EME on their findings, and 8 (12.5%) attempted to quantitatively estimate the impact of EME on the reported risk estimate. Further, 8 (12.5%) articles erroneously claimed that EME would bias risk estimates towards the null (2 of which were also included above).
Conclusions
Modern epidemiological research shows improved handling and interpretation of EME, although some concerns persist; in particular, the literature remains resistant to adopting state-of-the-art methods for managing measurement errors. We recommend that the practice of qualitatively discussing the impact of exposure measurement error in epidemiology be replaced by comprehensive quantitative accounting for it using modern statistical methods.
Journal Article
Pride and adversity among nurses and physicians during the pandemic in two US healthcare systems: a mixed methods analysis
2022
Background
Our aims were to examine the most difficult or distressing events reported by healthcare workers during the first wave of the COVID-19 pandemic in two US healthcare systems, to identify common themes, and to relate these themes to behavioral theory and to measures of anxiety and depression.
Methods
We conducted a cross-sectional survey of nurses and physicians during the early phases of the COVID-19 pandemic in the US. A recruitment letter was emailed, and about half of the respondents chose to supply open-ended responses suitable for thematic analysis. We measured symptoms of anxiety and depression separately, captured demographics, and asked two open-ended questions about events that were the most difficult or stressful and events that reinforced pride. We report descriptive statistics and coded thematic categories along the continuum from “pride” to “distress”, relating these factors to the fostering of well-being according to Self-Determination Theory.
Results
Themes that emerged from these narratives were congruent with the prediction of Self-Determination Theory that autonomy-supportive experiences foster pride, while autonomy-thwarting experiences cause distress. Those who reported distressing events were more anxious and depressed than those who did not. Among those who reported incidents that reinforced pride in the profession, depression was less common than among those who did not. These trends remained evident after allowing for medical history and other covariates in logistic regressions.
Conclusion
Causal claims from our analysis should be made with caution due to the cross-sectional research design. Understanding perceptions of the pandemic by nurses and physicians may help identify and manage sources of distress, and suggest means of mitigating the risk of mental health distress through autonomy-supportive policies.
Journal Article
Experiences of coping with the first wave of COVID-19 epidemic in Philadelphia, PA: Mixed methods analysis of a cross-sectional survey of worries and symptoms of mood disorders
2021
Our objective was to describe how residents of Philadelphia, Pennsylvania, coped psychologically with the first wave of the COVID-19 pandemic. In a cross-sectional design, we aimed to estimate the rates and correlates of anxiety and depression, examine how specific worries correlated with general anxiety and depression, and synthesize themes of “the most difficult experiences” shared by the respondents. We collected data through an on-line survey in a convenience sample of 1,293 adult residents of Philadelphia, PA between April 17 and July 3, 2020, inquiring about symptoms of anxiety and depression (via the Hospital Anxiety and Depression Scale), specific worries, open-ended narratives of “the most difficult experiences” (coded into themes), demographics, perceived sources of support, and general health. Anxiety was evident among 30–40% of participants and depression among about 10%. Factor analysis revealed two distinct, yet inter-related clusters of specific worries related to mood disorders: concern about “hardships” and “fear of infection”. Regression analyses revealed that anxiety, depression, and fear of infection, but not concern about hardships, worsened over the course of the epidemic. “The most difficult experiences”, characterized by loss of income, poor health of self or others, uncertainty, death of a relative or a friend, and struggle accessing food, were each associated with some of the measures of worries and mood disorders. Respondents who believed they could rely on the support of a close personal network fared better psychologically than those who reported relying primarily on government and social services organizations. Thematic analysis revealed complex perceptions of the pandemic by the participants, giving clues to both positive and negative experiences that may have affected how they coped.
Despite concerns about external validity, our observations are concordant with emerging evidence of psychological toll of the COVID-19 pandemic and measures employed to mitigate risk of infection.
Journal Article
Association of polycyclic aromatic hydrocarbons in moss with blood biomarker among nearby residents in Portland, Oregon
2022
Polycyclic aromatic hydrocarbons (PAHs) are air pollutants that are costly to measure using traditional air-quality monitoring methods. We used an epiphytic bio-indicator (moss genus: Orthotrichum) to cost-effectively evaluate atmospheric deposition of PAHs in Portland, Oregon in May 2013. However, it is unclear whether measurements derived from these bio-indicators are good proxies for human exposure. To address this question, we simultaneously measured PAH-DNA adducts in blood samples of non-smokers residing close to the sites of moss measurements. We accounted for individual determinants of PAH uptake that are unrelated to environmental air quality through questionnaires, e.g., wood fires and consumption of barbecued and fried meats. Spearman rank correlation and linear regression (the latter controlling for confounding by lifestyle factors) were used to evaluate the associations. We did not observe evidence of an association between PAH levels in moss and PAH-DNA adducts in the blood of nearby residents (e.g., all correlations p≥0.5), but higher levels of adducts were evident in those who had used a wood fire in their homes in the previous 48 hours. It remains to be determined whether bio-indicators such as moss can be used for human health risk assessment.
Journal Article
Peering through the mist: systematic review of what the chemistry of contaminants in electronic cigarettes tells us about health risks
2014
Background
Electronic cigarettes (e-cigarettes) are generally recognized as a safer alternative to combusted tobacco products, but there are conflicting claims about the degree to which these products warrant concern for the health of the vapers (e-cigarette users). This paper reviews available data on chemistry of aerosols and liquids of electronic cigarettes and compares modeled exposure of vapers with occupational safety standards.
Methods
Both peer-reviewed and “grey” literature were accessed and more than 9,000 observations of highly variable quality were extracted. Comparisons to the most universally recognized workplace exposure standards, Threshold Limit Values (TLVs), were conducted under “worst case” assumptions about both chemical content of aerosol and liquids as well as behavior of vapers.
Results
There was no evidence of potential for exposures of e-cigarette users to contaminants that are associated with risk to health at a level that would warrant attention if it were an involuntary workplace exposure. The vast majority of predicted exposures are <1% of TLV. Predicted exposures to acrolein and formaldehyde are typically <5% of TLV. Considering exposure to the aerosol as a mixture of contaminants did not indicate that exceeding half of the TLV for mixtures was plausible. Only exposures to the declared major ingredients -- propylene glycol and glycerin -- warrant attention because of the precautionary nature of TLVs for exposures to hydrocarbons with no established toxicity.
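The mixture comparison referenced above follows the standard additive approach for exposure standards: sum each contaminant's concentration as a fraction of its TLV, and flag concern when the sum approaches 1 (the review used half that value as a screening level). A minimal sketch, with illustrative concentrations and TLVs that are NOT taken from the review:

```python
# Hypothetical concentrations and TLVs (mg/m^3); values are illustrative only.
exposures = {  # contaminant: (estimated concentration, TLV)
    "formaldehyde": (0.012, 0.37),
    "acrolein":     (0.0004, 0.023),
    "nickel":       (0.00001, 1.5),
}

# Additive mixture index: the sum of fractions of each contaminant's TLV.
mixture_index = sum(c / tlv for c, tlv in exposures.values())
print(f"mixture index = {mixture_index:.3f}")  # well below the 0.5 screening level
```

This additive index treats the contaminants as acting on the same target, which is the conventional conservative assumption when mixture-specific toxicology is unavailable.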
Conclusions
Current state of knowledge about the chemistry of liquids and aerosols associated with electronic cigarettes indicates that there is no evidence that vaping produces inhalable exposures to contaminants of the aerosol that would warrant health concerns by the standards that are used to ensure safety of workplaces. However, the aerosol generated during vaping as a whole (contaminants plus declared ingredients) creates personal exposures that would justify surveillance of health among exposed persons in conjunction with investigation of means to keep any adverse health effects as low as reasonably achievable. Exposures of bystanders are likely to be orders of magnitude less, and thus pose no apparent concern.
Journal Article
Towards reduction in bias in epidemic curves due to outcome misclassification through Bayesian analysis of time-series of laboratory test results: case study of COVID-19 in Alberta, Canada and Philadelphia, USA
by
Goldstein, Neal D.
,
Burstyn, Igor
,
Gustafson, Paul
in
Accuracy
,
Alberta - epidemiology
,
Analysis
2020
Background
Despite widespread use, the accuracy of the diagnostic test for SARS-CoV-2 infection is poorly understood. The aim of our work was to better quantify misclassification errors in identification of true cases of COVID-19 and to study the impact of these errors in epidemic curves using publicly available surveillance data from Alberta, Canada and Philadelphia, USA.
Methods
We examined time-series data of laboratory tests for SARS-CoV-2 viral infection, the causal agent of COVID-19, to explore, using a Bayesian approach, the sensitivity and specificity of the diagnostic test.
Results
Our analysis revealed that the data were compatible with near-perfect specificity, but it was challenging to gain information about sensitivity. We applied these insights to uncertainty/bias analysis of epidemic curves under the assumptions of both improving and degrading sensitivity. If sensitivity improved from 60% to 95%, the adjusted epidemic curves likely fall within the 95% confidence intervals of the observed counts. However, bias in the shape and peak of the epidemic curves can be pronounced if sensitivity either degrades or remains poor in the 60–70% range. In the extreme scenario, hundreds of undiagnosed cases, even among the tested, are possible, potentially leading to further unchecked contagion should these cases not self-isolate.
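Under the near-perfect specificity suggested by the analysis, the adjustment of an epidemic curve reduces to dividing observed counts by the assumed sensitivity. A minimal sketch with hypothetical daily counts (not the Alberta or Philadelphia data):

```python
# Adjust a daily epidemic curve for imperfect test sensitivity, assuming
# perfect specificity; counts and sensitivities are hypothetical.
observed = [12, 30, 55, 80, 60, 40]   # daily confirmed cases

def adjust(counts, sensitivities):
    """Expected true cases ~ observed / Se when specificity is perfect."""
    return [c / se for c, se in zip(counts, sensitivities)]

# Scenario 1: sensitivity improves from 60% to 95% over time.
improving = [0.60, 0.67, 0.74, 0.81, 0.88, 0.95]
# Scenario 2: sensitivity stays poor throughout.
poor = [0.65] * len(observed)

print(adjust(observed, improving))
print(adjust(observed, poor))
```

Comparing the two scenarios makes visible how time-varying sensitivity distorts the shape and peak of the curve, which is the bias the abstract describes.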
Conclusion
The best way to better understand bias in the epidemic curves of COVID-19 due to errors in testing is to empirically evaluate misclassification of diagnosis in clinical settings and apply this knowledge to adjustment of epidemic curves.
Journal Article
Parkinson’s disease and occupational exposure to organic solvents in Finland: a nationwide case-control study
by
Koskinen, Aki
,
Sainio, Markku
,
Burstyn, Igor
in
Case-Control Studies
,
case-control study
,
chlorinated hydrocarbon
2024
Objective
This study aimed to investigate the association between Parkinson's disease (PD) and occupational exposure to organic solvents generally and chlorinated hydrocarbons (CHC) in particular.
Methods
We assembled a Finland-wide case–control study for birth years 1930–1950 by identifying incident PD cases from the register of Reimbursement of Medical Costs and drawing two controls per case using incidence density sampling from the Population Information System, matched on sex, birth year, and residency in Finland in 1980–2014. Occupation and socioeconomic status (SES) were identified from national censuses. We assessed cumulative occupational exposures via the FINJEM job-exposure matrix. Smoking was based on occupation-specific prevalence by sex from national surveys. We estimated confounder-adjusted PD incidence rate ratios (IRR) via logistic regression and evaluated their sensitivity to errors in FINJEM through probabilistic bias analysis (PBA).
Results
Among the ever-employed, we identified 17 187 cases (16.0% potentially exposed to CHC) and 35 738 matched controls. Cases were more likely to not smoke and to belong to higher SES. Cumulative exposure to CHC (per 100 ppm-years, 5-year lag) was associated with an adjusted IRR of 1.235 (95% confidence interval 0.986–1.547), with stronger associations among women and among persons who had more census records. Sensitivity analyses did not reveal notable associations, but stronger effects were seen in the younger birth cohort (1940–1950). PBA produced notably weaker associations, yielding a median IRR of 1.097 (95% simulation interval 0.920–1.291) for CHC.
Conclusion
Our findings imply that PD is unlikely to be related to typical occupational solvent exposure in Finland, but excess risk cannot be ruled out in some highly exposed occupations.
Journal Article
It can be dangerous to take epidemic curves of COVID-19 at face value
by
Goldstein, Neal D.
,
Burstyn, Igor
,
Gustafson, Paul
in
Bias
,
Canada - epidemiology
,
Clinical Laboratory Techniques - standards
2020
During an epidemic with a new virus, we depend on modelling to plan the response: but how good are the data? The aim of our work was to better understand the impact of misclassification errors in the identification of true cases of COVID-19 on epidemic curves. Data originated from Alberta, Canada (available on 28 May 2020). There is presently no information on the sensitivity (Sn) and specificity (Sp) of the laboratory tests used in Canada for the causal agent of COVID-19. Therefore, we examined the best attainable performance in other jurisdictions and for similar viruses. This suggested perfect Sp and Sn of 60–95%. We used these values to re-calculate epidemic curves to visualize the potential bias due to imperfect testing. If the sensitivity improved over time, the observed and adjusted epidemic curves likely fall within the 95% confidence intervals of the observed counts. However, bias in the shape and peak of the epidemic curves can be pronounced if sensitivity either degrades or remains poor in the 60–70% range. These issues are minor early in the epidemic, but hundreds of undiagnosed cases are likely later on. It is therefore hazardous to judge the progress of the epidemic based on observed epidemic curves unless the quality of testing is better understood.
Journal Article