Catalogue Search | MBRL

A powerful method for pleiotropic analysis under composite null hypothesis identifies novel shared loci between Type 2 Diabetes and Prostate Cancer

by Ray, Debashree , Chatterjee, Nilanjan in Analysis , Biology and Life Sciences , Cardiovascular disease

2020

There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a genome-wide association study (GWAS). The underlying methods, however, are often designed to test the global null hypothesis that there is no association of a genetic variant with any of the traits, the rejection of which does not implicate pleiotropy. In this article, we propose a new statistical approach, PLACO, for specifically detecting pleiotropic loci between two traits by considering an underlying composite null hypothesis that a variant is associated with none or only one of the traits. We propose testing the null hypothesis based on the product of the Z-statistics of the genetic variants across two studies and derive a null distribution of the test statistic in the form of a mixture distribution that allows for fractions of variants to be associated with none or only one of the traits. We borrow approaches from the statistical literature on mediation analysis that allow asymptotic approximation of the null distribution avoiding estimation of nuisance parameters related to mixture proportions and variance components. Simulation studies demonstrate that the proposed method can maintain type I error and can achieve major power gain over alternative simpler methods that are typically used for testing pleiotropy. PLACO allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. Application of PLACO to publicly available summary data from two large case-control GWAS of Type 2 Diabetes and of Prostate Cancer implicated a number of novel shared genetic regions: 3q23 (ZBTB38), 6q25.3 (RGS17), 9p22.1 (HAUS6), 9p13.3 (UBAP2), 11p11.2 (RAPSN), 14q12 (AKAP6), 15q15 (KNL1) and 18q23 (ZNF236).
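
As a rough illustration of the test statistic described above, the following Python sketch forms the product of two Z-statistics and estimates its two-sided tail probability by Monte Carlo. It covers only the simplest case, where the variant is associated with neither trait and the two studies are independent. That is an assumption made for brevity: it is not PLACO's composite-null mixture distribution and it does not handle correlated summary statistics. The function names and input values are hypothetical.

```python
import random

def product_statistic(z1, z2):
    """Product of the two per-study Z-statistics for one variant."""
    return z1 * z2

def global_null_tail_mc(t, n_draws=100_000, seed=1):
    """Monte Carlo estimate of P(|Z1 * Z2| >= |t|) for independent
    standard normal Z1 and Z2 (global null only, an assumed simplification)."""
    rng = random.Random(seed)
    threshold = abs(t)
    hits = 0
    for _ in range(n_draws):
        if abs(rng.gauss(0.0, 1.0) * rng.gauss(0.0, 1.0)) >= threshold:
            hits += 1
    return hits / n_draws

# Hypothetical Z-statistics for one variant in two studies
t = product_statistic(2.5, -3.0)
p_approx = global_null_tail_mc(t)
```

A Monte Carlo estimate cannot resolve genome-wide significance levels; it is used here only to show how the product statistic is formed and compared with a null reference.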

Journal Article

Effect of non-normality and low count variants on cross-phenotype association tests in GWAS

by Ray, Debashree , Chatterjee, Nilanjan in Alleles , Amino acids , Diabetes mellitus (non-insulin dependent)

2020

Many complex human diseases, such as type 2 diabetes, are characterized by multiple underlying traits/phenotypes that have substantially shared genetic architecture. Multivariate analysis of correlated traits has the potential to increase the power of detecting underlying common genetic loci. Several cross-phenotype association methods have been proposed; some require individual-level data on traits and genotypes, while the others require only summary-level data. In this article, we explore whether non-normality of multivariate trait distribution affects the inference from some of the existing multi-trait methods and how that effect is dependent on the allele count of the genetic variant being tested. We find that most of these tests are susceptible to biases that lead to spurious association signals. Even after controlling for confounders that may contribute to non-normality and then applying inverse normal transformation on the residuals of each trait, these tests may have inflated type I errors for variants with low minor allele counts (MACs). A likelihood ratio test of association based on the ordinal regression of individual-level genotype conditional on the traits seems to be the least biased and can maintain type I error when the MAC is reasonably large (e.g., MAC > 30). Application of these methods to publicly available summary statistics of eight amino acid traits on European samples seems to exhibit systematic inflation (especially for variants with low MAC), which is consistent with our findings from simulation experiments.
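
The inverse normal transformation mentioned above can be sketched in Python as a rank-based mapping to standard normal quantiles. The abstract does not state which variant the authors used; the Blom offset (c = 3/8) and the no-ties assumption below are choices made for this illustration, and the residual values are hypothetical.

```python
from statistics import NormalDist

def inverse_normal_transform(values, c=0.375):
    """Rank-based inverse normal transformation with the Blom offset c = 3/8.
    Assumes no tied values; ties would need average ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]

# Hypothetical skewed residuals for one trait
residuals = [0.1, 5.2, 0.4, 0.9, 12.7]
transformed = inverse_normal_transform(residuals)
```

The transformation preserves the ordering of the residuals while forcing their marginal distribution toward normality, which is why it cannot by itself repair problems tied to low minor allele counts.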

Journal Article

Association of Polygenic Score With Tumor Molecular Subtypes in Papillary Thyroid Carcinoma

by Ray, Debashree , Wang, Jennifer R , Zafereo, Mark E in Analysis , Cancer , Carcinoma

2024

Abstract Context Genome-wide association studies have identified germline variants associated with elevated PTC risk. It is also known that somatic driver mutations contribute to PTC development and as such PTCs can be further categorized into different molecular subtypes based on their somatic alterations. However, it remains unknown whether identified germline variants predictive of PTC risk are associated with specific molecular subtypes. Objective The primary goal of the present study is to determine whether germline genetic risk, as assessed using a polygenic score (PGS), is associated with molecular subtypes of papillary thyroid carcinoma (PTC), defined based on tumor driver mutation status. Methods This study was carried out using data from The Cancer Genome Atlas (TCGA) thyroid cancer study. A previously validated 10–single-nucleotide variation PGS for PTC derived from genome-wide association study hits was calculated to ascertain germline genetic risk. The primary molecular subtypes of interest were defined by tumor driver mutation status (BRAFV600E-mutated vs RAS-mutated vs “other”). We also explored associations between PGS and molecular subtypes defined by messenger RNA (mRNA) expression, microRNA expression, and DNA methylation patterns. Polytomous logistic regression analysis was used to assess the association between PGS and PTC molecular subtype with and without adjustment for clinical variables. Odds ratios (ORs) with their 95% CIs were estimated. Results A total of 359 patients were included in the study. PGS was significantly associated with specific tumor molecular subtypes defined by tumor driver mutation status. Increasing germline risk was associated with higher odds of BRAFV600E-mutated PTC compared to PTCs without driver mutations in the “other” category. No significant difference was detected in terms of PGS tumor categorization in the RAS subtype compared to BRAFV600E. 
In exploratory analyses, PGS was also associated with mRNA-, microRNA-, and DNA methylation–defined molecular subtypes, as defined by the TCGA PTC study. Conclusion PGS has molecular subtype-specific associations in PTC, which has implications for their use in risk prediction.
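
A polygenic score is commonly computed as a weighted sum of risk-allele dosages. The Python sketch below shows that construction under stated assumptions: the abstract does not give the ten variants or their weights, so every number here is hypothetical, and the weighting scheme (per-variant log odds ratios) is a common convention rather than a detail confirmed by the source.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

# Hypothetical per-variant log odds ratios for a 10-variant score
weights = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20, 0.12, 0.08, 0.18, 0.22]
# Hypothetical risk-allele counts for one person
dosages = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
score = polygenic_score(dosages, weights)
```

In an analysis like the one described, a score of this kind would enter the polytomous logistic regression as the predictor of molecular subtype.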

Journal Article

Incorporating false negative tests in epidemiological models for SARS-CoV-2 transmission and reconciling with seroprevalence estimates

by Salvatore, Maxwell , Bhaduri, Ritwik , Kundu, Ritoban in 631/250/255/2514 , 692/53/2421 , 692/699/255/2514

2021

Susceptible-Exposed-Infected-Removed (SEIR)-type epidemiologic models, modeling unascertained infections latently, can predict unreported cases and deaths assuming perfect testing. We apply a method we developed to account for the high false negative rates of diagnostic RT-PCR tests for detecting an active SARS-CoV-2 infection in a classic SEIR model. The number of unascertained cases and false negatives being unobservable in a real study, population-based serosurveys can help validate model projections. Applying our method to training data from Delhi, India, during March 15–June 30, 2020, we estimate the underreporting factor for cases at 34–53 (deaths: 8–13) on July 10, 2020, largely consistent with the findings of the first round of serosurveys for Delhi (done during June 27–July 10, 2020) with an estimated 22.86% IgG antibody prevalence, yielding estimated underreporting factors of 30–42 for cases. Together, these imply approximately 96–98% of cases in Delhi remained unreported (July 10, 2020). Updated calculations using training data during March 15–December 31, 2020 yield an estimated underreporting factor for cases at 13–22 (deaths: 3–7) on January 23, 2021, which is again consistent with the latest (fifth) round of serosurveys for Delhi (done during January 15–23, 2021) with an estimated 56.13% IgG antibody prevalence, yielding an estimated range for the underreporting factor for cases at 17–21. Together, these updated estimates imply approximately 92–96% of cases in Delhi remained unreported (January 23, 2021). Such model-based estimates, updated with latest data, provide a viable alternative to repeated resource-intensive serosurveys for tracking unreported cases and deaths and gauging the true extent of the pandemic.
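
The link between an underreporting factor and the share of unreported cases can be checked with a few lines of Python. The sketch assumes the factor is defined as total infections divided by reported cases, which the abstract implies but does not state; the function name is hypothetical.

```python
def unreported_fraction(underreporting_factor):
    """Share of all infections that went unreported, assuming the factor
    is defined as total infections divided by reported cases."""
    if underreporting_factor < 1:
        raise ValueError("factor must be at least 1")
    return 1.0 - 1.0 / underreporting_factor

# Underreporting factors for cases quoted in the abstract (July 10, 2020)
for factor in (30, 34, 42, 53):
    print(factor, f"{100 * unreported_fraction(factor):.1f}%")
```

Under this definition the four factors map to roughly 96.7%, 97.1%, 97.6% and 98.1%, in line with the approximately 96–98% range quoted for July 10, 2020.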

Journal Article

Genome-wide large-scale multi-trait analysis characterizes global patterns of pleiotropy and unique trait-specific variants

by Dutta, Diptavo , Battle, Alexis , Chatterjee, Nilanjan in 45/43 , 631/114/2415 , 631/208

2024

Genome-wide association studies (GWAS) have found widespread evidence of pleiotropy, but characterization of global patterns of pleiotropy remains highly incomplete due to insufficient power of current approaches. We develop fastASSET, a method that allows efficient detection of variant-level pleiotropic association across many traits. We analyze GWAS summary statistics of 116 complex traits of diverse types collected from the GRASP repository and large GWAS Consortia. We identify 2293 independent loci and find that the lead variants in nearly all these loci (~99%) are associated with ≥ 2 traits (median = 6). We observe that the degree of pleiotropy estimated from our study predicts that observed in the UK Biobank for a much larger number of traits (K = 4114) (correlation = 0.43, p-value < 2.2 × 10⁻¹⁶). Follow-up analyses of 21 trait-specific variants indicate their link to the expression in trait-related tissues for a small number of genes involved in relevant biological processes. Our findings provide deeper insight into the nature of pleiotropy and lead to identification of highly trait-specific susceptibility variants. Here, the authors develop fastASSET, a method for efficient detection of variant-level pleiotropic association across many traits. Using this method, they characterize genome-wide pleiotropy and links to genomic features, identifying 21 trait-specific SNPs.

Journal Article

A comparison of five epidemiological models for transmission of SARS-CoV-2 in India

by Salvatore, Maxwell , Gu, Xuelin , Bhaduri, Ritwik in Bayes Theorem , Bayesian analysis , Communicable Disease Control - methods

2021

Background Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures, lockdowns, and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline curve-fitting model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). Methods Using COVID-19 case-recovery-death count data reported in India from March 15 to October 15 to train the models, we generate predictions from each of the five models from October 16 to December 31. To compare prediction accuracy with respect to reported cumulative and active case counts and reported cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. For reported cumulative cases and deaths, we compute Pearson’s and Lin’s correlation coefficients to investigate how well the projected and observed reported counts agree. We also present underreporting factors when available, and comment on uncertainty of projections from each model. Results For active case counts, SMAPE values are 35.14% (SEIR-fansy) and 37.96% (eSIR). For cumulative case counts, SMAPE values are 6.89% (baseline), 6.59% (eSIR), 2.25% (SAPHIRE) and 2.29% (SEIR-fansy). For cumulative death counts, the SMAPE values are 4.74% (SEIR-fansy), 8.94% (eSIR) and 0.77% (ICM). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) cumulative case counts as well. We compute underreporting factors as of October 31 and note that for cumulative cases, the SEIR-fansy model yields an underreporting factor of 7.25 and ICM model yields 4.54 for the same quantity. 
For total (sum of reported and unreported) cumulative deaths the SEIR-fansy model reports an underreporting factor of 2.97. On October 31, we observe 8.18 million cumulative reported cases, while the projections (in millions) from the baseline model are 8.71 (95% credible interval: 8.63–8.80), while eSIR yields 8.35 (7.19–9.60), SAPHIRE returns 8.17 (7.90–8.52) and SEIR-fansy projects 8.51 (8.18–8.85) million cases. Cumulative case projections from the eSIR model have the highest uncertainty in terms of width of 95% credible intervals, followed by those from SAPHIRE, the baseline model and finally SEIR-fansy. Conclusions In this comparative paper, we describe five different models used to study the transmission dynamics of the SARS-CoV-2 virus in India. While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case-counts against observed data on a test period. The largest variability across models is observed in predicting the “total” number of infections including reported and unreported cases (on which we have no validation data). The degree of under-reporting has been a major concern in India and is characterized in this report. Overall, the SEIR-fansy model appeared to be a good choice, with a publicly available R package and the desired flexibility and accuracy.
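
The symmetric mean absolute prediction error used to compare the models can be sketched in Python as follows. Several SMAPE definitions are in circulation and the abstract does not say which one was used, so the denominator below (the mean of the absolute observed and predicted values) is an assumption, as are the example counts.

```python
def smape(observed, predicted):
    """Symmetric mean absolute prediction error, in percent.
    Uses the mean of |observed| and |predicted| as the denominator;
    other definitions exist, so this choice is an assumption."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for obs, pred in zip(observed, predicted):
        denom = (abs(obs) + abs(pred)) / 2.0
        total += abs(pred - obs) / denom if denom else 0.0
    return 100.0 * total / len(observed)

# Hypothetical cumulative case counts (in millions) over a short test period
observed = [8.00, 8.05, 8.10, 8.18]
predicted = [8.10, 8.20, 8.30, 8.51]
error_pct = smape(observed, predicted)
```

Because the error is scaled by the size of the counts, values for cumulative series tend to be much smaller than for active case counts, which is consistent with the pattern reported in the results.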

Journal Article

Exome sequencing of Finnish isolates enhances rare-variant association power

by Chiang, Colby C. , Yin, Xianyong , Abel, Haley J. in 631/208/205 , 631/208/514 , 631/208/729

2019

Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power. Exome-wide sequencing studies of populations in Finland identified 26 deleterious alleles associated with 64 quantitative traits that are clinically relevant to cardiovascular and metabolic diseases.

Journal Article

The Association Between Thyroid Differentiation Score and Survival Outcomes in Papillary Thyroid Carcinoma

by Xu, Li , Henderson, Ying , Ray, Debashree in Adult , Aged , Analysis

2025

Abstract Context Thyroid differentiation score (TDS), calculated based on mRNA expression levels of 16 genes controlling thyroid metabolism and function, has been proposed as a measure to quantify differentiation in papillary thyroid carcinoma (PTC). Objective The objective of this study is to determine whether TDS is associated with survival outcomes across patient cohorts. Methods Two independent cohorts of patients with PTC were used: (1) The Cancer Genome Atlas (TCGA) thyroid cancer study (N = 372), (2) MD Anderson Cancer Center (MDACC) cohort (N = 111). The primary survival outcome of interest was progression-free interval (PFI). Association with overall survival (OS) was also explored. The Kaplan–Meier method and Cox proportional hazards models were used for survival analyses. Results In both cohorts, TDS was associated with tumor and nodal stage at diagnosis as well as tumor driver mutation status. High TDS was associated with longer PFI on univariable analyses across cohorts. After adjusting for overall stage, TDS remained significantly associated with PFI in the MDACC cohort only (adjusted hazard ratio [aHR] 0.67, 95% CI 0.52-0.85). In subgroup analyses stratified by tumor driver mutation status, higher TDS was most consistently associated with longer PFI in BRAFV600E-mutated tumors in the MDACC cohort after adjusting for overall stage (TCGA: aHR 0.60, 95% CI 0.33-1.07; MDACC: aHR 0.59, 95% CI 0.42-0.82). For OS, increasing TDS was associated with longer OS in the overall MDACC cohort (aHR = 0.78, 95% CI 0.63-0.96), where the median duration of follow-up was 12.9 years. Conclusion TDS quantifies the spectrum of differentiation status in PTC and may serve as a potential prognostic biomarker in PTC, most promisingly in BRAFV600E-mutated tumors.
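
For readers unfamiliar with the Kaplan–Meier method named in the Methods, the Python sketch below computes the product-limit survival estimate from follow-up times and event indicators. It is a minimal illustration with hypothetical data and a hypothetical function name, not the study's analysis code, and it omits confidence intervals and group comparisons.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.
    times: follow-up time per patient; events: 1 if the event
    (e.g. progression) was observed, 0 if the patient was censored.
    Returns (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Hypothetical follow-up in years; 0 marks a censored patient
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients leave the risk set without lowering the curve, which is what lets the method use incomplete follow-up.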

Journal Article

Pleiotropy method reveals genetic overlap between orofacial clefts at multiple novel loci from GWAS of multi-ethnic trios

by Leslie, Elizabeth J. , Weinberg, Seth M. , Hetmanski, Jacqueline B. in Biology and Life Sciences , Birth defects , Chromosome 1

2021

Based on epidemiologic and embryologic patterns, nonsyndromic orofacial clefts, the most common craniofacial birth defects in humans, are commonly categorized into cleft lip with or without cleft palate (CL/P) and cleft palate alone (CP), which are traditionally considered to be etiologically distinct. However, some evidence of shared genetic risk in IRF6, GRHL3 and ARHGAP29 regions exists; only FOXE1 has been recognized as significantly associated with both CL/P and CP in genome-wide association studies (GWAS). We used a new statistical approach, PLACO (pleiotropic analysis under composite null), on a combined multi-ethnic GWAS of 2,771 CL/P and 611 CP case-parent trios. At the genome-wide significance threshold of 5 × 10⁻⁸, PLACO identified 1 locus in 1q32.2 (IRF6) that appears to increase risk for one OFC subgroup but decrease risk for the other. At a suggestive significance threshold of 10⁻⁶, we found 5 more loci with compelling candidate genes having opposite effects on CL/P and CP: 1p36.13 (PAX7), 3q29 (DLG1), 4p13 (LIMCH1), 4q21.1 (SHROOM3) and 17q22 (NOG). Additionally, we replicated the recognized shared locus 9q22.33 (FOXE1), and identified 2 loci in 19p13.12 (RAB8A) and 20q12 (MAFB) that appear to influence risk of both CL/P and CP in the same direction. We found locus-specific effects may vary by racial/ethnic group at these regions of genetic overlap, and failed to find evidence of sex-specific differences. We confirmed shared etiology of the two OFC subtypes comprising CL/P, and additionally found suggestive evidence of differences in their pathogenesis at 2 loci of genetic overlap. Our novel findings include 6 new loci of genetic overlap between CL/P and CP; 3 new loci between pairwise OFC subtypes; and 4 loci not previously implicated in OFCs. 
Our in-silico validation showed PLACO is robust to subtype-specific effects, and can achieve massive power gains over existing approaches for identifying genetic overlap between disease subtypes. In summary, we found suggestive evidence for new genetic regions and confirmed some recognized OFC genes either exerting shared risk or with opposite effects on risk to OFC subtypes.

Journal Article

Estimating the wave 1 and wave 2 infection fatality rates from SARS-CoV-2 in India

by Barker, Daniel , Kleinsasser, Michael , Bhaduri, Ritwik in Biomedical and Life Sciences , Biomedicine , Case fatality rate

2021

Objective There has been much discussion and debate around the underreporting of COVID-19 infections and deaths in India. In this short report we first estimate the underreporting factor for infections from publicly available data released by the Indian Council of Medical Research on reported number of cases and national seroprevalence surveys. We then use a compartmental epidemiologic model to estimate the undetected number of infections and deaths, yielding estimates of the corresponding underreporting factors. We compare the serosurvey-based ad hoc estimate of the infection fatality rate (IFR) with the model-based estimate. Since the first and second waves in India are intrinsically different in nature, we carry out this exercise in two periods: the first wave (April 1, 2020–January 31, 2021) and part of the second wave (February 1, 2021–May 15, 2021). The latest national seroprevalence estimate is from January 2021, and thus only relevant to our wave 1 calculations. Results Both wave 1 and wave 2 estimates qualitatively show that there is a large degree of “covert infections” in India, with the model-based estimated underreporting factor for infections as 11.11 (95% credible interval (CrI) 10.71–11.47) and for deaths as 3.56 (95% CrI 3.48–3.64) for wave 1. For wave 2, the underreporting factor for infections escalates to 26.77 (95% CrI 24.26–28.81) and to 5.77 (95% CrI 5.34–6.15) for deaths. If we rely on only reported deaths, the IFR estimate is 0.13% for wave 1 and 0.03% for part of wave 2. Taking underreporting of deaths into account, the IFR estimate is 0.46% for wave 1 and 0.18% for wave 2 (till May 15). Combining waves 1 and 2, as of May 15, while India reported a total of nearly 25 million cases and 270 thousand deaths, the estimated number of infections and deaths stand at 491 million (36% of the population) and 1.21 million respectively, yielding an estimated (combined) infection fatality rate of 0.25%. 
There is considerable variation in these estimates across Indian states. Up to date seroprevalence studies and mortality data are needed to validate these model-based estimates.
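
The combined infection fatality rate quoted above follows from simple division, which the Python sketch below reproduces from the rounded totals in the abstract. The variable and function names are hypothetical, and because the inputs are rounded ("nearly 25 million"), the implied underreporting factors are only approximate.

```python
def infection_fatality_rate(deaths, infections):
    """Deaths divided by infections, as a fraction."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    return deaths / infections

# Rounded combined totals for waves 1 and 2, as of May 15, from the abstract
reported_cases = 25e6
reported_deaths = 270e3
estimated_infections = 491e6
estimated_deaths = 1.21e6

ifr_combined = infection_fatality_rate(estimated_deaths, estimated_infections)
# Implied combined underreporting factors (approximate, from rounded inputs)
infection_factor = estimated_infections / reported_cases
death_factor = estimated_deaths / reported_deaths

print(f"IFR: {100 * ifr_combined:.2f}%")
```

With these inputs the ratio rounds to 0.25%, matching the combined estimate reported in the abstract.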

Journal Article
