Catalogue Search | MBRL

Variance component model to account for sample structure in genome-wide association studies

by Sul, Jae Hoon , Service, Susan K , Zaitlen, Noah A in 631/1647/2217/2138 , Agriculture , Animal Genetics and Genomics

2010

Eleazar Eskin and colleagues report a variance component model for correcting for sample structure in association studies. The EMMAX program is publicly available and may be used for analysis of genome-wide association study datasets. Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.

Journal Article

ForestQC: Quality control on genetic variants from next-generation sequencing data using random forest

by Sul, Jae Hoon , Hwang, Sungoo , Coppola, Giovanni in Algorithms , Biology and Life Sciences , Classification

2019

Next-generation sequencing technology (NGS) enables the discovery of nearly all genetic variants present in a genome. A subset of these variants, however, may have poor sequencing quality due to limitations in NGS or variant callers. In genetic studies that analyze a large number of sequenced individuals, it is critical to detect and remove those variants with poor quality as they may cause spurious findings. In this paper, we present ForestQC, a statistical tool for performing quality control on variants identified from NGS data by combining a traditional filtering approach and a machine learning approach. Our software uses the information on sequencing quality, such as sequencing depth, genotyping quality, and GC contents, to predict whether a particular variant is likely to be false-positive. To evaluate ForestQC, we applied it to two whole-genome sequencing datasets where one dataset consists of related individuals from families while the other consists of unrelated individuals. Results indicate that ForestQC outperforms widely used methods for performing quality control on variants such as VQSR of GATK by considerably improving the quality of variants to be included in the analysis. ForestQC is also very efficient, and hence can be applied to large sequencing datasets. We conclude that combining a machine learning algorithm trained with sequencing quality information and the filtering approach is a practical approach to perform quality control on genetic variants from sequencing data.

Journal Article

Genome-wide association studies of metabolites in Finnish men identify disease-relevant loci

by Erdos, Michael R. , Kuusisto, Johanna , Yin, Xianyong in 45/43 , 631/208/205/2138 , 631/208/729/743

2022

Few studies have explored the impact of rare variants (minor allele frequency < 1%) on highly heritable plasma metabolites identified in metabolomic screens. The Finnish population provides an ideal opportunity for such explorations, given the multiple bottlenecks and expansions that have shaped its history, and the enrichment for many otherwise rare alleles that has resulted. Here, we report genetic associations for 1391 plasma metabolites in 6136 men from the late-settlement region of Finland. We identify 303 novel association signals, more than one third at variants rare or enriched in Finns. Many of these signals identify genes not previously implicated in metabolite genome-wide association studies and suggest mechanisms for diseases and disease-related traits. The Finnish population is enriched for genetic variants which are rare in other populations. Here, the authors find new genetic loci associated with 1391 circulating metabolites in 6136 Finnish men, demonstrating that metabolite genetic associations can help elucidate disease mechanisms.

Journal Article

Mapping and characterization of structural variation in 17,795 human genomes

by Abel, Haley J. , Chiang, Colby , Zody, Michael C. in 45/23 , 631/208/212 , 631/208/457/649/2157

2020

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline 1 to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0–11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing. Structural variants in more than 17,000 human genomes are mapped and characterized using whole-genome sequencing, showing how this type of variation contributes to rare deleterious coding and noncoding alleles.

Journal Article

ACE2 and TMPRSS2 variation in savanna monkeys (Chlorocebus spp.): Potential risk for zoonotic/anthroponotic transmission of SARS-CoV-2 and a potential model for functional studies

by Jasinska, Anna J. , Bergey, Christina M. , Burt, Felicity in ACE2 , Amino acids , Angiotensin-Converting Enzyme 2

2020

The COVID-19 pandemic, caused by the coronavirus SARS-CoV-2, has devastated health infrastructure around the world. Both ACE2 (an entry receptor) and TMPRSS2 (used by the virus for spike protein priming) are key proteins to SARS-CoV-2 cell entry, enabling progression to COVID-19 in humans. Comparative genomic research into critical ACE2 binding sites, associated with the spike receptor binding domain, has suggested that African and Asian primates may also be susceptible to disease from SARS-CoV-2 infection. Savanna monkeys (Chlorocebus spp.) are a widespread non-human primate with well-established potential as a bi-directional zoonotic/anthroponotic agent due to high levels of human interaction throughout their range in sub-Saharan Africa and the Caribbean. To characterize potential functional variation in savanna monkey ACE2 and TMPRSS2, we inspected recently published genomic data from 245 savanna monkeys, including 163 wild monkeys from Africa and the Caribbean and 82 captive monkeys from the Vervet Research Colony (VRC). We found several missense variants. One missense variant in ACE2 (X:14,077,550; Asp30Gly), common in Ch. sabaeus, causes a change in amino acid residue that has been inferred to reduce binding efficiency of SARS-CoV-2, suggesting potentially reduced susceptibility. The remaining populations appear as susceptible as humans, based on these criteria for receptor usage. All missense variants observed in wild Ch. sabaeus populations are also present in the VRC, along with two splice acceptor variants (at X:14,065,076) not observed in the wild sample that are potentially disruptive to ACE2 function. The presence of these variants in the VRC suggests a promising model for SARS-CoV-2 infection and vaccine and therapy development. 
In keeping with a One Health approach, characterizing actual susceptibility and potential for bi-directional zoonotic/anthroponotic transfer in savanna monkey populations may be an important consideration for controlling COVID-19 epidemics in communities with frequent human/non-human primate interactions that, in many cases, may have limited health infrastructure.

Journal Article

Ancient hybridization and strong adaptation to viruses across African vervet monkey populations

by Wilson, Richard K , Schmitt, Christopher A , Jasinska, Anna J in 45/23 , 631/208/457 , 631/250/255/1901

2017

Analysis of whole-genome sequencing data from 163 vervet monkeys from Africa and the Caribbean shows high diversity among taxa and identifies signatures of selection. Selection signals affect viral processes, and genes that show response to SIV in vervets but not macaques have elevated selection scores. Vervet monkeys are among the most widely distributed nonhuman primates, show considerable phenotypic diversity, and have long been an important biomedical model for a variety of human diseases and in vaccine research. Using whole-genome sequencing data from 163 vervets sampled from across Africa and the Caribbean, we find high diversity within and between taxa and clear evidence that taxonomic divergence was reticulate rather than following a simple branching pattern. A scan for diversifying selection across taxa identifies strong and highly polygenic selection signals affecting viral processes. Furthermore, selection scores are elevated in genes whose human orthologs interact with HIV and in genes that show a response to experimental simian immunodeficiency virus (SIV) infection in vervet monkeys but not in rhesus macaques, suggesting that part of the signal reflects taxon-specific adaptation to SIV.

Journal Article

Geographic Patterns of Genome Admixture in Latin American Mestizos

by Bedoya, Gabriel , Parra, Maria V. , Alfaro, Emma L. in African Continental Ancestry Group - genetics , American Native Continental Ancestry Group - genetics , Chromosomes, Human, X - genetics

2008

The large and diverse population of Latin America is potentially a powerful resource for elucidating the genetic basis of complex traits through admixture mapping. However, no genome-wide characterization of admixture across Latin America has yet been attempted. Here, we report an analysis of admixture in thirteen Mestizo populations (i.e. in regions of mainly European and Native settlement) from seven countries in Latin America based on data for 678 autosomal and 29 X-chromosome microsatellites. We found extensive variation in Native American and European ancestry (and generally low levels of African ancestry) among populations and individuals, and evidence that admixture across Latin America has often involved predominantly European men and both Native and African women. An admixture analysis allowing for Native American population subdivision revealed a differentiation of the Native American ancestry amongst Mestizos. This observation is consistent with the genetic structure of pre-Columbian populations and with admixture having involved Natives from the area where the Mestizos examined are located. Our findings agree with available information on the demographic history of Latin America and have a number of implications for the design of association studies in populations from the region.

Journal Article

Exome sequencing in bipolar disorder identifies AKAP11 as a risk gene shared with schizophrenia

by Clair, David St , Dickerson, Faith , Kahn, Rene S. in 45/23 , 631/208/1515 , 631/208/1516

2022

We report results from the Bipolar Exome (BipEx) collaboration analysis of whole-exome sequencing of 13,933 patients with bipolar disorder (BD) matched with 14,422 controls. We find an excess of ultra-rare protein-truncating variants (PTVs) in patients with BD among genes under strong evolutionary constraint in both major BD subtypes. We find enrichment of ultra-rare PTVs within genes implicated from a recent schizophrenia exome meta-analysis (SCHEMA; 24,248 cases and 97,322 controls) and among binding targets of CHD8. Genes implicated from genome-wide association studies (GWASs) of BD, however, are not significantly enriched for ultra-rare PTVs. Combining gene-level results with SCHEMA, AKAP11 emerges as a definitive risk gene (odds ratio (OR) = 7.06, P = 2.83 × 10⁻⁹). At the protein level, AKAP-11 interacts with GSK3B, the hypothesized target of lithium, a primary treatment for BD. Our results lend support to BD’s polygenicity, demonstrating a role for rare coding variation as a significant risk factor in BD etiology. Exome sequencing analysis of 13,933 individuals with bipolar disorder finds enrichment of ultra-rare protein-truncating variants in constrained genes. Combined analysis with schizophrenia exome data identifies AKAP11 as a risk gene for both disorders.

Journal Article

Genome-wide association analysis of metabolic traits in a birth cohort from a founder population

by Pouta, Anneli , Coin, Lachlan , Sabatti, Chiara in Adult , Agriculture , Alcohol use

2009

Nelson Freimer and colleagues report the first genome-wide association study of a longitudinal birth cohort (the Northern Finland Birth Cohort 1966). The results include new associations for nine quantitative metabolic traits. Genome-wide association studies (GWAS) of longitudinal birth cohorts enable joint investigation of environmental and genetic influences on complex traits. We report GWAS results for nine quantitative metabolic traits (triglycerides, high-density lipoprotein, low-density lipoprotein, glucose, insulin, C-reactive protein, body mass index, and systolic and diastolic blood pressure) in the Northern Finland Birth Cohort 1966 (NFBC1966), drawn from the most genetically isolated Finnish regions. We replicate most previously reported associations for these traits and identify nine new associations, several of which highlight genes with metabolic functions: high-density lipoprotein with NR1H3 (LXRA), low-density lipoprotein with AR and FADS1-FADS2, glucose with MTNR1B, and insulin with PANK1. Two of these new associations emerged after adjustment of results for body mass index. Gene–environment interaction analyses suggested additional associations, which will require validation in larger samples. The currently identified loci, together with quantified environmental exposures, explain little of the trait variation in NFBC1966. The association observed between low-density lipoprotein and an infrequent variant in AR suggests the potential of such a cohort for identifying associations with both common, low-impact and rarer, high-impact quantitative trait loci.

Journal Article

Rare loss-of-function variants in SETD1A are associated with schizophrenia and developmental disorders

by Cole, Trevor , Breen, Gerome , Ayub, Muhammad in 45/23 , 45/43 , 631/208/176

2016

The authors analyzed the whole-exome sequences of over 16,000 individuals and found that very rare variants predicted to disrupt the SETD1A gene confer substantial risk for schizophrenia. Damaging variants in SETD1A were also associated with diverse, severe developmental disorders, providing an important genetic link between schizophrenia and other neurodevelopmental disorders. By analyzing the whole-exome sequences of 4,264 schizophrenia cases, 9,343 controls and 1,077 trios, we identified a genome-wide significant association between rare loss-of-function (LoF) variants in SETD1A and risk for schizophrenia (P = 3.3 × 10⁻⁹). We found only two heterozygous LoF variants in 45,376 exomes from individuals without a neuropsychiatric diagnosis, indicating that SETD1A is substantially depleted of LoF variants in the general population. Seven of the ten individuals with schizophrenia carrying SETD1A LoF variants also had learning difficulties. We further identified four SETD1A LoF carriers among 4,281 children with severe developmental disorders and two more carriers in an independent sample of 5,720 Finnish exomes, both with notable neuropsychiatric phenotypes. Together, our observations indicate that LoF variants in SETD1A cause a range of neurodevelopmental disorders, including schizophrenia. Combining these data with previous common variant evidence, we suggest that epigenetic dysregulation, specifically in the histone H3K4 methylation pathway, is an important mechanism in the pathogenesis of schizophrenia.

Journal Article
