Catalogue Search | MBRL

The mystery of missing heritability: Genetic interactions create phantom heritability

by Hechter, Eliana , Sunyaev, Shamil R , Lander, Eric S in Biological Sciences , Crohn disease , Crohn's disease

2012

Human genetics has been haunted by the mystery of "missing heritability" of common traits. Although studies have discovered >1,200 variants associated with common diseases and traits, these variants typically appear to explain only a minority of the heritability. The proportion of heritability explained by a set of variants is the ratio of (i) the heritability due to these variants (numerator), estimated directly from their observed effects, to (ii) the total heritability (denominator), inferred indirectly from population data. The prevailing view has been that the explanation for missing heritability lies in the numerator, that is, in as-yet undiscovered variants. While many variants surely remain to be found, we show here that a substantial portion of missing heritability could arise from overestimation of the denominator, creating "phantom heritability." Specifically, (i) estimates of total heritability implicitly assume the trait involves no genetic interactions (epistasis) among loci; (ii) this assumption is not justified, because models with interactions are also consistent with observable data; and (iii) under such models, the total heritability may be much smaller and thus the proportion of heritability explained much larger. For example, 80% of the currently missing heritability for Crohn's disease could be due to genetic interactions, if the disease involves interaction among three pathways. In short, missing heritability need not directly correspond to missing variants, because current estimates of total heritability may be significantly inflated by genetic interactions. Finally, we describe a method for estimating heritability from isolated populations that is not inflated by genetic interactions.
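
The ratio described in this abstract can be illustrated with a minimal sketch. The function name and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch shows how the same numerator gives a larger proportion explained when the denominator (total heritability) is smaller than the conventional estimate.

```python
def proportion_explained(h2_known: float, h2_total: float) -> float:
    """Ratio of heritability due to known variants (numerator)
    to total heritability (denominator)."""
    if h2_total <= 0:
        raise ValueError("total heritability must be positive")
    return h2_known / h2_total


# Hypothetical values, for illustration only (not from the paper).
h2_known = 0.10           # heritability attributed to discovered variants
h2_total_apparent = 0.50  # total inferred under a no-interaction assumption
h2_total_smaller = 0.25   # assumed smaller total if that estimate is inflated

apparent = proportion_explained(h2_known, h2_total_apparent)
adjusted = proportion_explained(h2_known, h2_total_smaller)

print(f"apparent proportion explained: {apparent:.0%}")
print(f"with smaller denominator:      {adjusted:.0%}")
```

With these illustrative inputs, halving the denominator doubles the proportion explained without any new variants being found, which is the sense in which an inflated denominator can mimic missing variants.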

Journal Article

Identification of cancer driver genes based on nucleotide context

by Weghorn, Donate , Van Allen, Eliezer M. , Sunyaev, Shamil R. in 631/114/794 , 631/61/212/2166 , 631/67/395

2020

Cancer genomes contain large numbers of somatic mutations but few of these mutations drive tumor development. Current approaches either identify driver genes on the basis of mutational recurrence or approximate the functional consequences of nonsynonymous mutations by using bioinformatic scores. Passenger mutations are enriched in characteristic nucleotide contexts, whereas driver mutations occur in functional positions, which are not necessarily surrounded by a particular nucleotide context. We observed that mutations in contexts that deviate from the characteristic contexts around passenger mutations provide a signal in favor of driver genes. We therefore developed a method that combines this feature with the signals traditionally used for driver-gene identification. We applied our method to whole-exome sequencing data from 11,873 tumor–normal pairs and identified 460 driver genes that clustered into 21 cancer-related pathways. Our study provides a resource of driver genes across 28 tumor types with additional driver genes identified according to mutations in unusual nucleotide contexts. MutPanning is a new method to detect cancer driver genes that identifies genes with an excess of mutations in unusual nucleotide contexts. Applying this to whole-exome sequencing data from 11,873 tumor–normal pairs identifies 460 driver genes.

Journal Article

The missing link between genetic association and regulatory function

by Chun, Sung , Lee, Daniel , Sunyaev, Shamil R in Alleles , colocalization , Diabetes

2022

The genetic basis of most traits is highly polygenic and dominated by non-coding alleles. It is widely assumed that such alleles exert small regulatory effects on the expression of cis-linked genes. However, despite the availability of gene expression and epigenomic datasets, few variant-to-gene links have emerged. It is unclear whether these sparse results are due to limitations in available data and methods, or to deficiencies in the underlying assumed model. To better distinguish between these possibilities, we identified 220 gene–trait pairs in which protein-coding variants influence a complex trait or its Mendelian cognate. Despite the presence of expression quantitative trait loci near most GWAS associations, by applying a gene-based approach we found limited evidence that the baseline expression of trait-related genes explains GWAS associations, whether using colocalization methods (8% of genes implicated), transcription-wide association (2% of genes implicated), or a combination of regulatory annotations and distance (4% of genes implicated). These results contradict the hypothesis that most complex trait-associated variants coincide with homeostatic expression QTLs, suggesting that better models are needed. The field must confront this deficit and pursue this ‘missing regulation.’

Journal Article

Cell-of-origin chromatin organization shapes the mutational landscape of cancer

by Polak, Paz , Stamatoyannopoulos, John A. , Sunyaev, Shamil R. in 631/67/69 , Accuracy , Cancer

2015

An analysis of cell-type-specific epigenomic features reveals a relationship between epigenomic and mutational profiles; chromatin characteristics can explain a large proportion of mutational variance in cancer genomes and the mutational distribution can identify the probable cell type from which a given cancer originated. Chromatin organization in cancerous cells: Genomic studies have shown that different cancer types vary substantially in the local density and types of somatic mutations. This has been explained not only by differences in DNA sequence but also by other features including epigenetic organization. Shamil Sunyaev and colleagues now compare mutation densities to detailed epigenetic profiles of different cell types and tissues. They demonstrate that epigenomic features of a given cell type or tissue in which a cancer arises are much stronger determinants of mutational profiles than other properties. Conversely, the findings make it possible to deduce information on the possible tissue-of-origin of a tumour based on its mutational landscape. Cancer is a disease potentiated by mutations in somatic cells. Cancer mutations are not distributed uniformly along the human genome. Instead, different human genomic regions vary by up to fivefold in the local density of cancer somatic mutations 1, posing a fundamental problem for statistical methods used in cancer genomics. Epigenomic organization has been proposed as a major determinant of the cancer mutational landscape 1, 2, 3, 4, 5. However, both somatic mutagenesis and epigenomic features are highly cell-type-specific 6, 7. We investigated the distribution of mutations in multiple independent samples of diverse cancer types and compared them to cell-type-specific epigenomic features. Here we show that chromatin accessibility and modification, together with replication timing, explain up to 86% of the variance in mutation rates along cancer genomes. The best predictors of local somatic mutation density are epigenomic features derived from the most likely cell type of origin of the corresponding malignancy. Moreover, we find that cell-of-origin chromatin features are much stronger determinants of cancer mutation profiles than chromatin features of matched cancer cell lines. Furthermore, we show that the cell type of origin of a cancer can be accurately determined based on the distribution of mutations along its genome. Thus, the DNA sequence of a cancer genome encompasses a wealth of information about the identity and epigenomic features of its cell of origin.

Journal Article

Searching for missing heritability: Designing rare variant association studies

by Eric S. Lander , Eliana Hechter , Benjamin M. Neale in alleles , Analysis , Biological Sciences

2014

Genetic studies have revealed thousands of loci predisposing to hundreds of human diseases and traits, revealing important biological pathways and defining novel therapeutic hypotheses. However, the genes discovered to date typically explain less than half of the apparent heritability. Because efforts have largely focused on common genetic variants, one hypothesis is that much of the missing heritability is due to rare genetic variants. Studies of common variants are typically referred to as genomewide association studies, whereas studies of rare variants are often simply called sequencing studies. Because they are actually closely related, we use the terms common variant association study (CVAS) and rare variant association study (RVAS). In this paper, we outline the similarities and differences between RVAS and CVAS and describe a conceptual framework for the design of RVAS. We apply the framework to address key questions about the sample sizes needed to detect association, the relative merits of testing disruptive alleles vs. missense alleles, frequency thresholds for filtering alleles, the value of predictors of the functional impact of missense alleles, the potential utility of isolated populations, the value of gene-set analysis, and the utility of de novo mutations. The optimal design depends critically on the selection coefficient against deleterious alleles and thus varies across genes. The analysis shows that common variant and rare variant studies require similarly large sample collections. In particular, a well-powered RVAS should involve discovery sets with at least 25,000 cases, together with a substantial replication set.

Journal Article

Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types

by Croteau-Chonka, Damien C , Chun, Sung , Sunyaev, Shamil R in 45/43 , 631/208/177 , 631/208/200

2017

Shamil Sunyaev, Chris Cotsapas and colleagues present a joint likelihood framework for determining the statistical evidence of shared genetic effects of overlapping disease-associated loci and expression quantitative trait loci (eQTLs). They find evidence for shared genetic effects at 25% of eQTL–autoimmune disease locus pairs. Most autoimmune-disease-risk effects identified by genome-wide association studies (GWAS) localize to open chromatin with gene-regulatory activity. GWAS loci are also enriched in expression quantitative trait loci (eQTLs), thus suggesting that most risk variants alter gene expression 1, 2. However, because causal variants are difficult to identify, and cis-eQTLs occur frequently, it remains challenging to identify specific instances of disease-relevant changes to gene regulation. Here, we used a novel joint likelihood framework with higher resolution than that of previous methods to identify loci where autoimmune-disease risk and an eQTL are driven by a single shared genetic effect. Using eQTLs from three major immune subpopulations, we found shared effects in only ∼25% of the loci examined. Thus, we show that a fraction of gene-regulatory changes suggest strong mechanistic hypotheses for disease risk, but we conclude that most risk mechanisms are not likely to involve changes in basal gene expression.

Journal Article

Genomic variation landscape of the human gut microbiome

by Bork, Peer , Kultima, Jens Roat , Mende, Daniel R. in 631/114/212/2142 , 631/208/726/649 , 631/326/2565/2142

2013

Whereas large-scale efforts have rapidly advanced the understanding and practical impact of human genomic variation, the practical impact of variation is largely unexplored in the human microbiome. We therefore developed a framework for metagenomic variation analysis and applied it to 252 faecal metagenomes of 207 individuals from Europe and North America. Using 7.4 billion reads aligned to 101 reference species, we detected 10.3 million single nucleotide polymorphisms (SNPs), 107,991 short insertions/deletions, and 1,051 structural variants. The average ratio of non-synonymous to synonymous polymorphism rates of 0.11 was more variable between gut microbial species than across human hosts. Subjects sampled at varying time intervals exhibited individuality and temporal stability of SNP variation patterns, despite considerable composition changes of their gut microbiota. This indicates that individual-specific strains are not easily replaced and that an individual might have a unique metagenomic genotype, which may be exploitable for personalized diet or drug intake. A framework for metagenomic variation analysis to explore variation in the human microbiome is developed; the study describes SNPs, short indels and structural variants in 252 faecal metagenomes of 207 individuals from Europe and North America. Gene variation in human gut microbes: A collaboration between members of the European MetaHIT and American NIH Human Microbiome projects has led to the development of a framework for metagenomic variation analysis, which is used to analyse single nucleotide polymorphisms, short indels and structural variants in 252 faecal metagenomes of 207 individuals from Europe and North America. Variation patterns suggest that individuals might have unique metagenomic genotypes that could provide data relevant to personalized dietary or drug choices.

Journal Article

Systematic Localization of Common Disease-Associated Variation in Regulatory DNA

by Kaul, Rajinder , Neri, Fidencio , Hansen, R. Scott in Alleles , Binding sites , Biological and medical sciences

2012

Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.

Journal Article

Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies

by Sohail, Mashaal , Sunyaev, Shamil R , Turchin, Michael C in Adaptation, Biological , Biostatistics , Body Height

2019

Genetic predictions of height differ among human populations and these differences have been interpreted as evidence of polygenic adaptation. These differences were first detected using SNPs genome-wide significantly associated with height, and shown to grow stronger when large numbers of sub-significant SNPs were included, leading to excitement about the prospect of analyzing large fractions of the genome to detect polygenic adaptation for multiple traits. Previous studies of height have been based on SNP effect size measurements in the GIANT Consortium meta-analysis. Here we repeat the analyses in the UK Biobank, a much more homogeneously designed study. We show that polygenic adaptation signals based on large numbers of SNPs below genome-wide significance are extremely sensitive to biases due to uncorrected population stratification. More generally, our results imply that typical constructions of polygenic scores are sensitive to population stratification and that population-level differences should be interpreted with caution. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).

Journal Article

Population-specific causal disease effect sizes in functionally important regions impacted by selection

by Price, Alkes L. , Koch, Evan M. , Luo, Yang in 631/208/1516 , 631/208/182 , 631/208/205/2138

2021

Many diseases exhibit population-specific causal effect sizes with trans-ethnic genetic correlations significantly less than 1, limiting trans-ethnic polygenic risk prediction. We develop a new method, S-LDXR, for stratifying squared trans-ethnic genetic correlation across genomic annotations, and apply S-LDXR to genome-wide summary statistics for 31 diseases and complex traits in East Asians (average N = 90K) and Europeans (average N = 267K) with an average trans-ethnic genetic correlation of 0.85. We determine that squared trans-ethnic genetic correlation is 0.82× (s.e. 0.01) depleted in the top quintile of background selection statistic, implying more population-specific causal effect sizes. Accordingly, causal effect sizes are more population-specific in functionally important regions, including conserved and regulatory regions. In regions surrounding specifically expressed genes, causal effect sizes are most population-specific for skin and immune genes, and least population-specific for brain genes. Our results could potentially be explained by stronger gene-environment interaction at loci impacted by selection, particularly positive selection. Trans-ethnic genetic correlation is significantly less than 1 for many diseases. Here, the authors stratify this correlation by genomic annotations, finding that loci whose causal disease effect sizes differ between ethnicities are likely impacted by selection, particularly positive selection.

Journal Article
