Catalogue Search | MBRL

GWAMA: software for genome-wide association meta-analysis

by Mägi, Reedik; Morris, Andrew P in Algorithms; Applications software; Bioinformatics

2010

Background: Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. Results: We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. Conclusions: The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files is freely available online at http://www.well.ox.ac.uk/GWAMA.
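
For orientation, the idea of combining per-study summary statistics can be sketched with a fixed-effects inverse-variance meta-analysis, a standard way to pool effect estimates for one variant. The abstract does not state which estimators GWAMA implements, so the method choice here is an assumption for illustration, and the function name `fixed_effects_meta` and its inputs are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-study effect estimates for one variant.

    betas: per-study effect sizes (e.g. log odds ratios), hypothetical input.
    ses:   matching standard errors, all strictly positive.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.10, 0.10])
    print(f"beta={beta:.3f} se={se:.4f} z={z:.3f} p={p:.4f}")
```

With two equally precise studies the pooled estimate is simply their mean, and the pooled standard error shrinks by a factor of the square root of two, which is the gain in power the abstract refers to.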

Journal Article

Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework

by Cho, Yoonsu; Gaunt, Tom R.; Sanderson, Eleanor in 45/43; 631/114/2415; 631/208/205

2020

In Mendelian randomization (MR) analysis, variants that exert horizontal pleiotropy are typically treated as a nuisance. However, they could be valuable in identifying alternative pathways to the traits under investigation. Here, we develop MR-TRYX, a framework that exploits horizontal pleiotropy to discover putative risk factors for disease. We begin by detecting outliers in a single exposure–outcome MR analysis, hypothesising they are due to horizontal pleiotropy. We search across hundreds of complete GWAS summary datasets to systematically identify other (candidate) traits that associate with the outliers. We develop a multi-trait pleiotropy model of the heterogeneity in the exposure–outcome analysis due to pathways through candidate traits. Through detailed investigation of several causal relationships, many pleiotropic pathways are uncovered with already established causal effects, validating the approach, but also alternative putative causal pathways. Adjustment for pleiotropic pathways reduces the heterogeneity across the analyses. In Mendelian randomization (MR) studies, one typically selects SNPs as instrumental variables that do not directly affect the outcome to avoid violation of MR assumptions. Here, Cho et al. present a framework, MR-TRYX, that leverages knowledge of such outliers of horizontal pleiotropy to identify putative causal relationships between exposure and outcome.

Journal Article

Mapping of 79 loci for 83 plasma protein biomarkers in cardiovascular disease

by Ziemek, Daniel; Folkersen, Lasse; Tremoli, Elena in Analysis; Biological markers; Biology and Life Sciences

2017

Recent advances in highly multiplexed immunoassays have allowed systematic large-scale measurement of hundreds of plasma proteins in large cohort studies. In combination with genotyping, such studies offer the prospect to 1) identify mechanisms involved with regulation of protein expression in plasma, and 2) determine whether the plasma proteins are likely to be causally implicated in disease. We report here the results of genome-wide association (GWA) studies of 83 proteins considered relevant to cardiovascular disease (CVD), measured in 3,394 individuals with multiple CVD risk factors. We identified 79 genome-wide significant (p<5e-8) association signals, 55 of which replicated at P<0.0007 in separate validation studies (n = 2,639 individuals). Using automated text mining, manual curation, and network-based methods incorporating information on expression quantitative trait loci (eQTL), we propose plausible causal mechanisms for 25 trans-acting loci, including a potential post-translational regulation of stem cell factor by matrix metalloproteinase 9 and receptor-ligand pairs such as RANK-RANK ligand. Using public GWA study data, we further evaluate all 79 loci for their causal effect on coronary artery disease, and highlight several potentially causal associations. Overall, a majority of the plasma proteins studied showed evidence of regulation at the genetic level. Our results enable future studies of the causal architecture of human disease, which in turn should aid discovery of new drug targets.

Journal Article

Multi-ethnic genome-wide association study identifies novel locus for type 2 diabetes susceptibility

by Cook, James P; Morris, Andrew P in Aging; Alzheimer's disease; Apolipoprotein E

2016

Genome-wide association studies (GWAS) have traditionally been undertaken in homogeneous populations from the same ancestry group. However, with the increasing availability of GWAS in large-scale multi-ethnic cohorts, we have evaluated a framework for detecting association of genetic variants with complex traits, allowing for population structure, and developed a powerful test of heterogeneity in allelic effects between ancestry groups. We have applied the methodology to identify and characterise loci associated with susceptibility to type 2 diabetes (T2D) using GWAS data from the Resource for Genetic Epidemiology on Adult Health and Aging, a large multi-ethnic population-based cohort, created for investigating the genetic and environmental basis of age-related diseases. We identified a novel locus for T2D susceptibility at genome-wide significance (P < 5 × 10^-8) that maps to TOMM40-APOE, a region previously implicated in lipid metabolism and Alzheimer's disease. We have also confirmed previous reports that single-nucleotide polymorphisms at the TCF7L2 locus demonstrate the greatest extent of heterogeneity in allelic effects between ethnic groups, with the lowest risk observed in populations of East Asian ancestry.

Journal Article

Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits

by Weedon, Michael N; Medland, Sarah E; Visscher, Peter M in 631/208/205/2138; 631/208/2489/144; 631/208/480

2012

Peter Visscher and colleagues report a new method for approximate conditional and joint association analysis that makes use of summary statistics from meta-analysis of GWAS. They apply this to meta-analysis summary data for height, body mass index and type 2 diabetes. We present an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of genome-wide association studies (GWAS) and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. Using this method, we analyzed meta-analysis summary data from the GIANT Consortium for height and body mass index (BMI), with the LD structure estimated from genotype data in two independent cohorts. We identified 36 loci with multiple associated variants for height (38 leading and 49 additional SNPs, 87 in total) via a genome-wide SNP selection procedure. The 49 new SNPs explain approximately 1.3% of variance, nearly doubling the heritability explained at the 36 loci. We did not find any locus showing multiple associated SNPs for BMI. The method we present is computationally fast and is also applicable to case-control data, which we demonstrate in an example from meta-analysis of type 2 diabetes by the DIAGRAM Consortium.

Journal Article

Data quality control in genetic case-control association studies

by Clarke, Geraldine M; Anderson, Carl A; Pettersson, Fredrik H in 631/114/1767; 631/114/794; 631/1647/2217/2138

2010

This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies. The steps described involve the identification and removal of DNA samples and markers that introduce bias. These critical steps are paramount to the success of a case-control study and are necessary before statistically testing for association. We describe how to use PLINK, a tool for handling SNP data, to perform assessments of failure rate per individual and per SNP and to assess the degree of relatedness between individuals. We also detail other quality-control procedures, including the use of SMARTPCA software for the identification of ancestral outliers. These platforms were selected because they are user-friendly, widely used and computationally efficient. Steps needed to detect and establish a disease association using case-control data are not discussed here. Issues concerning study design and marker selection in case-control studies have been discussed in our earlier protocols. This protocol, which is routinely used in our labs, should take approximately 8 h to complete.

Journal Article

Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel

by Esko, Tõnu; Mägi, Reedik; Palta, Priit in Accuracy; Bioinformatics; Consortia

2017

Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor allele frequency (MAF) ≥ 5% and low-frequency variants (0.5% ≤ MAF < 5%) across diverse populations, but the imputation of rare variation (MAF < 0.5%) is still rather limited. In the current study, we compare the imputation accuracy achieved with reference panels from diverse populations against that of a population-specific high-coverage (30×) whole-genome sequencing (WGS) based reference panel comprising 2244 Estonian individuals (0.25% of adult Estonians). Although the Estonian-specific panel contains fewer haplotypes and variants, the imputation confidence and accuracy of imputed low-frequency and rare variants were significantly higher. The results indicate the utility of population-specific reference panels for human genetic studies.

Journal Article

Analysis of chromatin organization and gene expression in T cells identifies functional genes for rheumatoid arthritis

by Adamson, Antony; Fraser, Peter; Rattray, Magnus in 38/39; 38/43; 38/77

2020

Genome-wide association studies have identified genetic variation contributing to complex disease risk. However, assigning causal genes and mechanisms has been more challenging because disease-associated variants are often found in distal regulatory regions with cell-type specific behaviours. Here, we collect ATAC-seq, Hi-C, Capture Hi-C and nuclear RNA-seq data in stimulated CD4+ T cells over 24 h, to identify functional enhancers regulating gene expression. We characterise changes in DNA interaction and activity dynamics that correlate with changes in gene expression, and find that the strongest correlations are observed within 200 kb of promoters. Using rheumatoid arthritis as an example of T cell mediated disease, we demonstrate interactions of expression quantitative trait loci with target genes, and confirm assigned genes or show complex interactions for 20% of disease associated loci, including FOXO1, which we confirm using CRISPR/Cas9. Although genome-wide association studies have identified genetic variation contributing to disease risk, assigning causal genes is challenging. Here, the authors generate ATAC-seq, Hi-C, Capture Hi-C and RNA-seq data in stimulated CD4+ T cells to identify functional enhancers and demonstrate interactions of expression quantitative trait loci with target genes in rheumatoid arthritis.

Journal Article

Leveraging information between multiple population groups and traits improves fine-mapping resolution

by Chikowore, Tinashe; Asimit, Jennifer L.; Soremekun, Opeyemi in 45/43; 631/114/2415; 631/208/205/2138

2023

Statistical fine-mapping helps to pinpoint likely causal variants underlying genetic association signals. Its resolution can be improved by (i) leveraging information between traits; and (ii) exploiting differences in linkage disequilibrium structure between diverse population groups. Using association summary statistics, MGflashfm jointly fine-maps signals from multiple traits and population groups; MGfm uses an analogous framework to analyse each trait separately. We also provide a practical approach to fine-mapping with out-of-sample reference panels. In simulation studies we show that MGflashfm and MGfm are well-calibrated and that the mean proportion of causal variants with PP > 0.80 is above 0.75 (MGflashfm) and 0.70 (MGfm). In our analysis of four lipid traits across five population groups, MGflashfm gives a median 99% credible set reduction of 10.5% over MGfm. MGflashfm and MGfm only require summary level data, making them very useful fine-mapping tools in consortia efforts where individual-level data cannot be shared. Statistical fine-mapping helps to pinpoint likely causal variants underlying genetic association signals, and can be enhanced by using multi-ancestry datasets. Here, the authors introduce MGflashfm, a fine-mapping method for pinpointing likely causal variants amongst multiple traits and population groups.

Journal Article

Evaluating the Performance of Fine-Mapping Strategies at Common Variant GWAS Loci

by Cortes, Adrian; Brown, Matthew A.; McCarthy, Mark I. in Bayes Theorem; Chromosome Mapping; Computer Simulation

2015

The growing availability of high-quality genomic annotation has increased the potential for mechanistic insights when the specific variants driving common genome-wide association signals are accurately localized. A range of fine-mapping strategies have been advocated, and specific successes reported, but the overall performance of such approaches, in the face of the extensive linkage disequilibrium that characterizes the human genome, is not well understood. Using simulations based on sequence data from the 1000 Genomes Project, we quantify the extent to which fine-mapping, here conducted using an approximate Bayesian approach, can be expected to lead to useful improvements in causal variant localization. We show that resolution is highly variable between loci, and that performance is severely degraded as the statistical power to detect association is reduced. We confirm that, where causal variants are shared between ancestry groups, further improvements in performance can be obtained in a trans-ethnic fine-mapping design. Finally, using empirical data from a recently published genome-wide association study for ankylosing spondylitis, we provide empirical confirmation of the behaviour of the approximate Bayesian approach and demonstrate that seven of twenty-six loci can be fine-mapped to fewer than ten variants.
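
To make "fine-mapped to fewer than ten variants" concrete, the sketch below builds a credible set from per-variant summary statistics using Wakefield-style approximate Bayes factors. The abstract says only that an approximate Bayesian approach was used, so the specific formula, the prior variance of 0.04, the assumption of a single causal variant per locus, and all function and variant names are illustrative assumptions rather than the authors' implementation.

```python
import math


def log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor in favour of association.

    Assumes a normal prior N(0, prior_var) on the true effect;
    prior_var = 0.04 is an assumed, commonly used default.
    """
    v = se ** 2
    shrink = prior_var / (v + prior_var)
    z2 = (beta / se) ** 2
    return 0.5 * (math.log(1.0 - shrink) + shrink * z2)


def credible_set(variants, coverage=0.99, prior_var=0.04):
    """Return (selected_names, posteriors) for one locus.

    variants: list of (name, beta, se) tuples, hypothetical input.
    Assumes exactly one causal variant and equal prior probability
    for every variant in the locus.
    """
    logs = [log_abf(b, s, prior_var) for _, b, s in variants]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    posteriors = {name: w / total for (name, _, _), w in zip(variants, weights)}
    # Add variants in order of decreasing posterior until coverage is reached.
    ranked = sorted(posteriors, key=posteriors.get, reverse=True)
    selected, cumulative = [], 0.0
    for name in ranked:
        selected.append(name)
        cumulative += posteriors[name]
        if cumulative >= coverage:
            break
    return selected, posteriors


if __name__ == "__main__":
    locus = [("rs1", 0.50, 0.05), ("rs2", 0.10, 0.05), ("rs3", 0.00, 0.05)]
    chosen, pp = credible_set(locus)
    print(chosen, {k: round(v, 4) for k, v in pp.items()})
```

A smaller credible set means better resolution. When variants are in strong linkage disequilibrium their statistics are similar, the posterior is spread across them, and the set grows, which is the loss of resolution the abstract describes.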

Journal Article
