26 result(s) for "Jadhav, Bharati"
The GIAB genomic stratifications resource for human reference genomes
Despite the growing variety of sequencing and variant-calling tools, no workflow performs equally well across the entire human genome. Understanding context-dependent performance is critical for enabling researchers, clinicians, and developers to make informed tradeoffs when selecting sequencing hardware and software. Here we describe a set of “stratifications,” which are BED files that define distinct contexts throughout the genome. We define these for GRCh37/38 as well as the new T2T-CHM13 reference, adding many new hard-to-sequence regions which are critical for understanding performance as the field progresses. Specifically, we highlight the increase in hard-to-map and GC-rich stratifications in CHM13 relative to the previous references. We then compare the benchmarking performance with each reference and show the performance penalty brought about by these additional difficult regions in CHM13. Additionally, we demonstrate how the stratifications can track context-specific improvements over different platform iterations, using Oxford Nanopore Technologies as an example. The means to generate these stratifications are available as a snakemake pipeline at https://github.com/usnistgov/giab-stratifications. We anticipate this being useful in enabling precise risk-reward calculations when building sequencing pipelines for any of the commonly-used reference genomes. The GIAB genomic stratification resource defines challenging regions in three commonly used human genome references, including the first complete human genome (CHM13). These help understand strengths and weaknesses of sequencing and analysis methods.
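As an illustration of how a BED-defined stratification can split benchmarking results by genomic context, here is a minimal Python sketch. It is not the GIAB snakemake pipeline; the function names, the toy coordinates, and the `(chrom, pos0, correct)` call format are all assumptions made for this example. It also assumes the intervals within one BED file do not overlap.

```python
from bisect import bisect_right


def parse_bed(lines):
    """Parse BED lines into {chrom: sorted [(start, end)]}, 0-based half-open."""
    regions = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def in_regions(regions, chrom, pos0):
    """True if the 0-based position lies in an interval (assumes no overlaps)."""
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos0.
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def stratified_counts(regions, calls):
    """Tally [correct, total] for calls inside and outside the stratification."""
    counts = {"inside": [0, 0], "outside": [0, 0]}
    for chrom, pos0, correct in calls:
        key = "inside" if in_regions(regions, chrom, pos0) else "outside"
        counts[key][0] += int(correct)
        counts[key][1] += 1
    return counts


# Hypothetical "hard-to-map" stratification and hypothetical benchmark calls.
hard = parse_bed(["chr1\t100\t200", "chr1\t500\t600"])
calls = [
    ("chr1", 150, True),
    ("chr1", 199, False),
    ("chr1", 200, True),   # end coordinate is exclusive, so this is outside
    ("chr2", 150, True),
]
result = stratified_counts(hard, calls)
```

Reporting correctness separately for calls inside and outside each stratification is what makes a context-specific performance penalty visible, rather than averaging it away across the genome.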
A phenome-wide association study of tandem repeat variation in 168,554 individuals from the UK Biobank
Most genetic association studies focus on binary variants. To identify the effects of multi-allelic variation of tandem repeats (TRs) on human traits, we perform direct TR genotyping and phenome-wide association studies in 168,554 individuals from the UK Biobank, identifying 47 TRs showing fine-mapped associations with 73 traits. We replicate 23 of 31 (74%) of these associations in the All of Us cohort. While this set includes several known repeat expansion disorders, novel associations we found are attributable to common polymorphic variation in TR length rather than rare expansions and include, e.g., a coding polyhistidine motif in HRCT1 influencing risk of hypertension and a poly(CGC) in the 5’UTR of GNB2 influencing heart rate. Fine-mapped TRs are strongly enriched for associations with local gene expression and DNA methylation. Our study highlights the contribution of multi-allelic TRs to the “missing heritability” of the human genome. Most genetic association studies focus on bi-allelic single nucleotide variants (SNVs). Here, to investigate the possibility that multi-allelic variation of short tandem repeats influences human traits, the authors perform a phenome-wide association study in the UK Biobank, identifying novel associations missed by traditional SNV-based genome-wide association analyses.
Foxa2 identifies a cardiac progenitor population with ventricular differentiation potential
The recent identification of progenitor populations that contribute to the developing heart in a distinct spatial and temporal manner has fundamentally improved our understanding of cardiac development. However, the mechanisms that direct atrial versus ventricular specification remain largely unknown. Here we report the identification of a progenitor population that gives rise primarily to cardiovascular cells of the ventricles and only to few atrial cells (<5%) of the differentiated heart. These progenitors are specified during gastrulation, when they transiently express Foxa2, a gene not previously implicated in cardiac development. Importantly, Foxa2+ cells contribute to previously identified progenitor populations in a defined pattern and ratio. Lastly, we describe an analogous Foxa2+ population during differentiation of embryonic stem cells. Together, these findings provide insight into the developmental origin of ventricular and atrial cells, and may lead to the establishment of new strategies for generating chamber-specific cell types from pluripotent stem cells. The progenitor populations that contribute to the key cardiac lineages in a chamber-specific manner are unknown. Here, the authors identify a Foxa2+ progenitor population, which is specified at gastrulation, as contributing primarily to cardiovascular cells of both ventricles and the epicardium in mice.
Identification of rare de novo epigenetic variations in congenital disorders
Certain human traits such as neurodevelopmental disorders (NDs) and congenital anomalies (CAs) are believed to be primarily genetic in origin. However, even after whole-genome sequencing (WGS), a substantial fraction of such disorders remain unexplained. We hypothesize that some cases of ND–CA are caused by aberrant DNA methylation leading to dysregulated genome function. Comparing DNA methylation profiles from 489 individuals with ND–CAs against 1534 controls, we identify epivariations as a frequent occurrence in the human genome. De novo epivariations are significantly enriched in cases, while RNAseq analysis shows that epivariations often have an impact on gene expression comparable to loss-of-function mutations. Additionally, we detect and replicate an enrichment of rare sequence mutations overlapping CTCF binding sites close to epivariations, providing a rationale for interpreting non-coding variation. We propose that epivariations contribute to the pathogenesis of some patients with unexplained ND–CAs, and as such likely have diagnostic relevance. A proportion of neurodevelopmental disorder and congenital anomaly cases remain without a genetic diagnosis. Here, the authors study aberrations of DNA methylation in such cases and find that epivariations might provide an explanation for some of these undiagnosed patients.
Rare genetic variation at transcription factor binding sites modulates local DNA methylation profiles
Although DNA methylation is the best characterized epigenetic mark, the mechanism by which it is targeted to specific regions in the genome remains unclear. Recent studies have revealed that local DNA methylation profiles might be dictated by cis-regulatory DNA sequences that mainly operate via DNA-binding factors. Consistent with this finding, we have recently shown that disruption of CTCF-binding sites by rare single nucleotide variants (SNVs) can underlie cis-linked DNA methylation changes in patients with congenital anomalies. These data raise the hypothesis that rare genetic variation at transcription factor binding sites (TFBSs) might contribute to local DNA methylation patterning. In this work, by combining blood genome-wide DNA methylation profiles, whole genome sequencing-derived SNVs from 247 unrelated individuals along with 133 predicted TFBS motifs derived from ENCODE ChIP-Seq data, we observed an association between the disruption of binding sites for multiple TFs by rare SNVs and extreme DNA methylation values at both local and, to a lesser extent, distant CpGs. While the majority of these changes affected only single CpGs, 24% were associated with multiple outlier CpGs within ±1 kb of the disrupted TFBS. Interestingly, disruption of functionally constrained sites within TF motifs led to larger DNA methylation changes at nearby CpG sites. Altogether, these findings suggest that rare SNVs at TFBS negatively influence TF-DNA binding, which can lead to an altered local DNA methylation profile. Furthermore, subsequent integration of DNA methylation and RNA-Seq profiles from cardiac tissues enabled us to observe an association between rare SNV-directed DNA methylation and outlier expression of nearby genes.
In conclusion, our findings not only provide insights into the effect of rare genetic variation at TFBS on shaping local DNA methylation and its consequences on genome regulation, but also provide a rationale to incorporate DNA methylation data to interpret the functional role of rare variants.
REViewer: haplotype-resolved visualization of read alignments in and around tandem repeats
Background: Expansions of short tandem repeats are the cause of many neurogenetic disorders including familial amyotrophic lateral sclerosis, Huntington disease, and many others. Multiple methods have been recently developed that can identify repeat expansions in whole genome or exome sequencing data. Despite the widely recognized need for visual assessment of variant calls in clinical settings, current computational tools lack the ability to produce such visualizations for repeat expansions. Expanded repeats are difficult to visualize because they correspond to large insertions relative to the reference genome and involve many misaligning and ambiguously aligning reads. Results: We implemented REViewer, a computational method for visualization of sequencing data in genomic regions containing long repeat expansions and FlipBook, a companion image viewer designed for manual curation of large collections of REViewer images. To generate a read pileup, REViewer reconstructs local haplotype sequences and distributes reads to these haplotypes in a way that is most consistent with the fragment lengths and evenness of read coverage. To create appropriate training materials for onboarding new users, we performed a concordance study involving 12 scientists involved in short tandem repeat research. We used the results of this study to create a user guide that describes the basic principles of using REViewer as well as a guide to the typical features of read pileups that correspond to low confidence repeat genotype calls. Additionally, we demonstrated that REViewer can be used to annotate clinically relevant repeat interruptions by comparing visual assessment results of 44 FMR1 repeat alleles with the results of triplet repeat primed PCR. For 38 of these alleles, the results of visual assessment were consistent with triplet repeat primed PCR.
Conclusions: Read pileup plots generated by REViewer offer an intuitive way to visualize sequencing data in regions containing long repeat expansions. Laboratories can use REViewer and FlipBook to assess the quality of repeat genotype calls as well as to visually detect interruptions or other imperfections in the repeat sequence and the surrounding flanking regions. REViewer and FlipBook are available under open-source licenses at https://github.com/illumina/REViewer and https://github.com/broadinstitute/flipbook respectively.
RNA-Seq in 296 phased trios provides a high-resolution map of genomic imprinting
Background: Identification of imprinted genes, demonstrating a consistent preference towards the paternal or maternal allelic expression, is important for the understanding of gene expression regulation during embryonic development and of the molecular basis of developmental disorders with a parent-of-origin effect. Combining allelic analysis of RNA-Seq data with phased genotypes in family trios provides a powerful method to detect parent-of-origin biases in gene expression. Results: We report findings in 296 family trios from two large studies: 165 lymphoblastoid cell lines from the 1000 Genomes Project and 131 blood samples from the Genome of the Netherlands (GoNL) participants. Based on parental haplotypes, we identified >2.8 million transcribed heterozygous SNVs phased for parental origin and developed a robust statistical framework for measuring allelic expression. We identified a total of 45 imprinted genes and one imprinted unannotated transcript, including multiple imprinted transcripts showing incomplete parental expression bias that were located adjacent to strongly imprinted genes. For example, PXDC1, a gene which lies adjacent to the paternally expressed gene FAM50B, shows a 2:1 paternal expression bias. Other imprinted genes had promoter regions that coincide with sites of parentally biased DNA methylation identified in the blood from uniparental disomy (UPD) samples, thus providing independent validation of our results. Using the stranded nature of the RNA-Seq data in lymphoblastoid cell lines, we identified multiple loci with overlapping sense/antisense transcripts, of which one is expressed paternally and the other maternally. Using a sliding window approach, we searched for imprinted expression across the entire genome, identifying a novel imprinted putative lncRNA in 13q21.2. Overall, we identified 7 transcripts showing parental bias in gene expression which were not reported in 4 other recent RNA-Seq studies of imprinting.
Conclusions: Our methods and data provide a robust and high-resolution map of imprinted gene expression in the human genome.
Genetic variation of tandem repeats as a cause of Alzheimer’s Disease
Background: Alzheimer’s Disease (AD) is a common neurodegenerative disorder affecting >35 million people worldwide. Despite extensive genetic studies, the identified factors only explain a small fraction of the heritable risk of AD. This suggests the contribution of yet-unknown genetic factors to the development of AD, such as tandem repeats (TRs). TRs are stretches of DNA consisting of two or more contiguous copies of a sequence of nucleotides (motif) arranged in a head-to-tail pattern (e.g., CAG-CAG-CAG). Recent studies indicate that rare expansions of short tandem repeats (STRs, motif size 1-10 bp) as well as common copy number variation of variable number tandem repeats (VNTRs, motif size >10 bp) can act as risk modifiers for AD. However, despite this evidence, there has been no comprehensive evaluation of TR variation in large cohorts of AD patients. Therefore, we hypothesize that variation in TR sequences can contribute to the development of AD in a fraction of cases. Method: To uncover the link between TR variation and AD, we are performing a comprehensive profiling of TR variation using whole genome sequencing (WGS) data generated by the UK Biobank (https://www.ukbiobank.ac.uk/). This involves estimating copy numbers in ∼220,000 polymorphic loci, including 77,000 STRs and 150,000 VNTRs, across ∼349,000 genomes (∼9,000 cases and ∼340,000 unrelated controls) using novel bioinformatic approaches, followed by extensive quality control to ensure the reliability of our estimates. Subsequently, we will use the resulting high-confidence estimates to identify TR alleles that exhibit significantly different copy numbers in AD patients compared to controls. Result: To date, we have successfully designed efficient pipelines for profiling both STRs and VNTRs using WGS on a large scale to be executed on the UK Biobank cloud-based environment (https://ukbiobank.dnanexus.com). Using these pipelines, we have generated copy number estimates in approximately 200,000 UK Biobank samples, including ∼1,200 AD cases. Conclusion: By leveraging the power of large cohorts and innovative methods, our study aims to explore the contribution of one of the most abundant and polymorphic, and yet understudied, genetic elements in the genome to AD, expanding our understanding of the genetic risk factors and mechanisms underlying this condition.
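The definitions in this abstract (a tandem repeat is two or more contiguous head-to-tail copies of a motif; STRs have motifs of 1-10 bp and VNTRs have motifs longer than 10 bp) can be made concrete with a small Python sketch. This is only an illustration of the definitions using exact string matching on a known sequence. It is not the authors' pipeline, which estimates copy numbers from whole genome sequencing reads, and all function names here are hypothetical.

```python
def classify_motif(motif):
    """Label a repeat motif as 'STR' (1-10 bp) or 'VNTR' (>10 bp)."""
    if not motif:
        raise ValueError("motif must contain at least one nucleotide")
    return "STR" if len(motif) <= 10 else "VNTR"


def count_copies(sequence, motif, start=0):
    """Count exact, contiguous head-to-tail copies of motif from index start."""
    if not motif:
        raise ValueError("motif must contain at least one nucleotide")
    copies = 0
    pos = start
    # Advance one motif length at a time while the next copy matches exactly.
    while sequence.startswith(motif, pos):
        copies += 1
        pos += len(motif)
    return copies


def is_tandem_repeat(sequence, motif, start=0):
    """A tandem repeat needs two or more contiguous copies of the motif."""
    return count_copies(sequence, motif, start) >= 2


# Toy sequence: two flanking bases, then CAG-CAG-CAG, then two flanking bases.
toy_sequence = "TTCAGCAGCAGTT"
cag_copies = count_copies(toy_sequence, "CAG", start=2)
cag_class = classify_motif("CAG")
```

Real repeats contain interruptions and sequencing errors, so exact matching like this would undercount in practice; the sketch only pins down the terminology used in the abstract.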
Increased frequency of repeat expansion mutations across different populations
Repeat expansion disorders (REDs) are a devastating group of predominantly neurological diseases. Together they are common, affecting 1 in 3,000 people worldwide with population-specific differences. However, prevalence estimates of REDs are hampered by heterogeneous clinical presentation, variable geographic distributions and technological limitations leading to underascertainment. Here, leveraging whole-genome sequencing data from 82,176 individuals from different populations, we found an overall disease allele frequency of REDs of 1 in 283 individuals. Modeling disease prevalence using genetic data, age at onset and survival, we show that the expected number of people with REDs would be two to three times higher than currently reported figures, indicating underdiagnosis and/or incomplete penetrance. While some REDs are population specific, for example, Huntington disease-like 2 in Africans, most REDs are represented in all broad genetic ancestries (that is, Europeans, Africans, Americans, East Asians and South Asians), challenging the notion that some REDs are found only in specific populations. These results have worldwide implications for local and global health communities in the diagnosis and counseling of REDs. Repeat expansion disorders are found in all major global populations, and their frequency is up to three times higher than typically quoted, suggesting underdiagnosis or incomplete penetrance.