Catalogue Search | MBRL
33 result(s) for "Marcketta, Anthony"
Computationally efficient whole-genome regression for quantitative and binary traits
2021
Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine-learning method called REGENIE for fitting a whole-genome regression model for quantitative and binary phenotypes that is substantially faster than alternatives in multi-trait analyses while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes and requires only local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. We introduce a fast, approximate Firth logistic regression test for unbalanced case–control phenotypes. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach using the UK Biobank dataset with up to 407,746 individuals.
REGENIE is a whole-genome regression method based on ridge regression that enables highly parallelized analysis of quantitative and binary traits in biobank-scale data with reduced computational requirements.
Journal Article
Exome sequencing and analysis of 454,787 UK Biobank participants
by Jones, Marcus; Benner, Christian; Gurski, Lauren
in 45/23, 631/208/205/2138, 631/208/457/649/2219
2021
A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10⁻¹¹. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene–trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.
Whole-exome sequencing analysis of 454,787 individuals in the UK Biobank is used to examine the association of protein-coding variants with nearly 4,000 health-related traits, identifying 564 distinct genes with significant trait associations.
Journal Article
Exome sequencing and characterization of 49,960 individuals in the UK Biobank
2020
The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.
Exome sequences from the first 49,960 participants in the UK Biobank highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.
Journal Article
Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR Study
2016
Precision medicine promises the ability to identify risks and treat patients on the basis of pathogenic genetic variation. Two studies combined exome sequencing results for over 50,000 people with their electronic health records. Dewey et al. found that ∼3.5% of individuals in their cohort had clinically actionable genetic variants. Many of these variants affected blood lipid levels that could influence cardiovascular health. Abul-Husn et al. extended these findings to investigate the genetics and treatment of familial hypercholesterolemia, a risk factor for cardiovascular disease, within their patient pool. Genetic screening helped identify at-risk patients who could benefit from increased treatment.
Science, this issue p. 10.1126/science.aaf6814, p. 10.1126/science.aaf7000
More than 50,000 exomes, coupled with electronic health records, inform on medically relevant genetic variants.
The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System couples high-throughput sequencing to an integrated health care system using longitudinal electronic health records (EHRs). We sequenced the exomes of 50,726 adult participants in the DiscovEHR study to identify ~4.2 million rare single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in a loss of gene function. Linking these data to EHR-derived clinical phenotypes, we find clinical associations supporting therapeutic targets, including genes encoding drug targets for lipid lowering, and identify previously unidentified rare alleles associated with lipid levels and other blood level traits. About 3.5% of individuals harbor deleterious variants in 76 clinically actionable genes. The DiscovEHR data set provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic discovery.
Journal Article
Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences
2016
Chris Tyler-Smith, Carlos Bustamante and colleagues report an analysis of 1,244 human Y chromosomes from the 1000 Genomes Project. They find that copy number variants have a higher predicted functional impact than other variant classes and infer bursts of male population expansion corresponding to historical periods of migration and technological innovations.
We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.
Journal Article
Genetic inactivation of ANGPTL4 improves glucose homeostasis and is associated with reduced risk of diabetes
by Shah, Svati H.; Van Hout, Cristopher V.; Wardeh, Amr H.
in 45/23, 631/208/205, 631/443/319/1642/137
2018
Angiopoietin-like 4 (ANGPTL4) is an endogenous inhibitor of lipoprotein lipase that modulates lipid levels, coronary atherosclerosis risk, and nutrient partitioning. We hypothesize that loss of ANGPTL4 function might improve glucose homeostasis and decrease risk of type 2 diabetes (T2D). We investigate protein-altering variants in ANGPTL4 among 58,124 participants in the DiscovEHR human genetics study, with follow-up studies in 82,766 T2D cases and 498,761 controls. Carriers of p.E40K, a variant that abolishes ANGPTL4 ability to inhibit lipoprotein lipase, have lower odds of T2D (odds ratio 0.89, 95% confidence interval 0.85–0.92, p = 6.3 × 10⁻¹⁰), lower fasting glucose, and greater insulin sensitivity. Predicted loss-of-function variants are associated with lower odds of T2D among 32,015 cases and 84,006 controls (odds ratio 0.71, 95% confidence interval 0.49–0.99, p = 0.041). Functional studies in Angptl4-deficient mice confirm improved insulin sensitivity and glucose homeostasis. In conclusion, genetic inactivation of ANGPTL4 is associated with improved glucose homeostasis and reduced risk of T2D.
Genetic variation in ANGPTL4 is associated with lipid traits. Here, the authors find that predicted loss-of-function variants in ANGPTL4 are associated with glucose homeostasis and reduced risk of type 2 diabetes and that Angptl4−/− mice on a high-fat diet show improved insulin sensitivity.
Journal Article
High heritability of ascending aortic diameter and trans-ancestry prediction of thoracic aortic disease
2022
Enlargement of the aorta is an important risk factor for aortic aneurysm and dissection, a leading cause of morbidity in the developed world. Here we performed automated extraction of ascending aortic diameter from cardiac magnetic resonance images of 36,021 individuals from the UK Biobank, followed by genome-wide association. We identified lead variants across 41 loci, including genes related to cardiovascular development (HAND2, TBX20) and Mendelian forms of thoracic aortic disease (ELN, FBN1). A polygenic score significantly predicted prevalent risk of thoracic aortic aneurysm and the need for surgical intervention for patients with thoracic aneurysm across multiple ancestries within the UK Biobank, FinnGen, the Penn Medicine Biobank and the Million Veterans Program (MVP). Additionally, we highlight the primary causal role of blood pressure in reducing aortic dilation using Mendelian randomization. Overall, our findings provide a roadmap for using genetic determinants of human anatomy to understand cardiovascular development while improving prediction of diseases of the thoracic aorta.
Trans-ancestry genome-wide analyses identify multiple loci associated with ascending aortic diameter. A polygenic score constructed from these loci predicted prevalent risk of thoracic aortic aneurysm in independent populations.
Journal Article
Genetic inactivation of zinc transporter SLC39A5 improves liver function and hyperglycemia in obesogenic settings
by Van Hout, Cristopher; Li, Alexander; Shuldiner, Alan
in Analysis, Animals, Cation Transport Proteins - genetics
2024
Recent studies have revealed a role for zinc in insulin secretion and glucose homeostasis. Randomized placebo-controlled zinc supplementation trials have demonstrated improved glycemic traits in patients with type II diabetes (T2D). Moreover, rare loss-of-function variants in the zinc efflux transporter SLC30A8 reduce T2D risk. Despite this accumulated evidence, a mechanistic understanding of how zinc influences systemic glucose homeostasis and consequently T2D risk remains unclear. To further explore the relationship between zinc and metabolic traits, we searched the exome database of the Regeneron Genetics Center-Geisinger Health System DiscovEHR cohort for genes that regulate zinc levels and associate with changes in metabolic traits. We then explored our main finding using in vitro and in vivo models. We identified rare loss-of-function (LOF) variants (MAF <1%) in Solute Carrier Family 39, Member 5 (SLC39A5) associated with increased circulating zinc (p = 4.9 × 10⁻⁴). Trans-ancestry meta-analysis across four studies exhibited a nominal association of SLC39A5 LOF variants with decreased T2D risk. To explore the mechanisms underlying these associations, we generated mice lacking Slc39a5. Slc39a5−/− mice display improved liver function and reduced hyperglycemia when challenged with congenital or diet-induced obesity. These improvements result from elevated hepatic zinc levels and concomitant activation of hepatic AMPK and AKT signaling, in part due to zinc-mediated inhibition of hepatic protein phosphatase activity. Furthermore, under conditions of diet-induced non-alcoholic steatohepatitis (NASH), Slc39a5−/− mice display significantly attenuated fibrosis and inflammation. Taken together, these results suggest SLC39A5 as a potential therapeutic target for non-alcoholic fatty liver disease (NAFLD) due to metabolic derangements including T2D.
Journal Article
Genome-wide analysis provides genetic evidence that ACE2 influences COVID-19 risk and yields risk scores associated with severe disease
2022
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) enters human host cells via angiotensin-converting enzyme 2 (ACE2) and causes coronavirus disease 2019 (COVID-19). Here, through a genome-wide association study, we identify a variant (rs190509934, minor allele frequency 0.2–2%) that downregulates ACE2 expression by 37% (P = 2.7 × 10⁻⁸) and reduces the risk of SARS-CoV-2 infection by 40% (odds ratio = 0.60, P = 4.5 × 10⁻¹³), providing human genetic evidence that ACE2 expression levels influence COVID-19 risk. We also replicate the associations of six previously reported risk variants, of which four were further associated with worse outcomes in individuals infected with the virus (in/near LZTFL1, MHC, DPP9 and IFNAR2). Lastly, we show that common variants define a risk score that is strongly associated with severe disease among cases and modestly improves the prediction of disease severity relative to demographic and clinical factors alone.
Genome-wide meta-analysis of SARS-CoV-2 susceptibility and severity phenotypes in up to 756,646 samples identifies a rare protective variant proximal to ACE2. A 6-SNP genetic risk score provides additional predictive power when added to known risk factors.
Journal Article
A deep catalogue of protein-coding variation in 983,578 individuals
2024
Rare coding variants that substantially affect function provide insights into the biology of a gene. However, ascertaining the frequency of such variants requires large sample sizes. Here we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. In total, 23% of the Regeneron Genetics Center Million Exome (RGC-ME) data come from individuals of African, East Asian, Indigenous American, Middle Eastern and South Asian ancestry. The catalogue includes more than 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss of function (LOF), we identify 3,988 LOF-intolerant genes, including 86 that were previously assessed as tolerant and 1,153 that lack established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions that are depleted of missense variants despite being tolerant of pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this resource of coding variation from the RGC-ME dataset publicly accessible through a variant allele frequency browser.
A dataset of coding variation, derived from exome sequencing of nearly one million individuals from a range of ancestries, provides insight into rare variants and could accelerate the discovery of disease-associated genes and advance precision medicine efforts.
Journal Article