Catalogue Search | MBRL
217 result(s) for "Yu, Fuli"
Clinical utility of noninvasive prenatal screening for expanded chromosome disease syndromes
2019
To assess the clinical performance of an expanded noninvasive prenatal screening (NIPS) test (“NIPS-Plus”) for detection of both aneuploidy and genome-wide microdeletion/microduplication syndromes (MMS).
A total of 94,085 women with a singleton pregnancy were prospectively enrolled in the study. The cell-free plasma DNA was directly sequenced without intermediate amplification and fetal abnormalities identified using an improved copy-number variation (CNV) calling algorithm.
A total of 1128 pregnancies (1.2%) were scored positive for clinically significant fetal chromosome abnormalities. This comprised 965 aneuploidies (1.026%) and 163 (0.174%) MMS. From follow-up tests, the positive predictive values (PPVs) for T21, T18, T13, rare trisomies, and sex chromosome aneuploidies were calculated as 95%, 82%, 46%, 29%, and 47%, respectively. For known MMS (n=32), PPVs were 93% (DiGeorge), 68% (22q11.22 microduplication), 75% (Prader–Willi/Angelman), and 50% (Cri du Chat). For the remaining genome-wide MMS (n=88), combined PPVs were 32% (CNVs ≥10Mb) and 19% (CNVs <10Mb).
NIPS-Plus yielded high PPVs for common aneuploidies and DiGeorge syndrome, and moderate PPVs for other MMS. Our results present compelling evidence that NIPS-Plus can be used as a first-tier pregnancy screening method to improve detection rates of clinically significant fetal chromosome abnormalities.
Journal Article
An integrative variant analysis suite for whole exome next-generation sequencing data
2012
Background
Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data.
Results
Using statistical models trained on validated whole-exome capture sequencing data, the Atlas2 Suite is an integrative variant analysis pipeline optimized for variant discovery on all three of the widely used next generation sequencing platforms (SOLiD, Illumina, and Roche 454). The suite employs logistic regression models in conjunction with user-adjustable cutoffs to accurately separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity (96.7%).
Conclusion
We have implemented the Atlas2 Suite and applied it to 92 whole exome samples from the 1000 Genomes Project. The Atlas2 Suite is available for download at http://sourceforge.net/projects/atlas2/. In addition to a command line version, the suite has been integrated into the Genboree Workbench, allowing biomedical scientists with minimal informatics expertise to remotely call, view, and further analyze variants through a simple web interface. The existing genomic databases displayed via the Genboree browser also streamline the process from variant discovery to functional genomics analysis, resulting in an off-the-shelf toolkit for the broader community.
Journal Article
Allele-specific epigenome maps reveal sequence-dependent stochastic switching at regulatory loci
2018
Genome-wide epigenetic marks regulate gene expression, but the amount and function of variability in these marks are poorly understood. Working with human-derived samples, Onuchic et al. examined disease-associated genetic variation and sequence-dependent allele-specific methylation at gene regulatory loci. Regulatory sequences within individual chromosomal DNA molecules showed full or no methylation at specific sites corresponding to “on” and “off” switches. Interestingly, methylation did not occur on each DNA molecule, resulting in a variable fraction of methylated chromosomes. This stochastic type of gene regulation was more common for rare genetic variants, which may suggest a role in human disease. Science, this issue p. eaar3146
Genome-wide analyses of epigenetic markers in human cells identify allele-specific functions that affect gene expression in health and disease.
To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.
Journal Article
Mutations in ASH1L confer susceptibility to Tourette syndrome
2020
Tourette syndrome (TS) is a childhood-onset neuropsychiatric disorder characterized by repetitive motor movements and vocal tics. The clinical manifestations of TS are complex and often overlap with other neuropsychiatric disorders. TS is highly heritable; however, the underlying genetic basis and molecular and neuronal mechanisms of TS remain largely unknown. We performed whole-exome sequencing of a hundred trios (probands and their parents) with detailed records of their clinical presentations and identified a risk gene, ASH1L, that was both de novo mutated and associated with TS based on a transmission disequilibrium test. As a replication, we performed follow-up targeted sequencing of ASH1L in an additional 524 unrelated TS samples and replicated the association (P value = 0.001). The point mutations in ASH1L cause defects in its enzymatic activity. Therefore, we established a transgenic mouse line and performed an array of anatomical, behavioral, and functional assays to investigate ASH1L function. The Ash1l+/− mice manifested tic-like behaviors and compulsive behaviors that could be rescued by the tic-relieving drug haloperidol. We also found that Ash1l disruption leads to hyper-activation and elevated dopamine-releasing events in the dorsal striatum, all of which could explain the neural mechanisms for the behavioral abnormalities in mice. Taken together, our results provide compelling evidence that ASH1L is a TS risk gene.
Journal Article
Reduced meiotic recombination in rhesus macaques and the origin of the human recombination landscape
by Liu, Xiaoming; Harris, R. Alan; Xue, Cheng
in Animal populations, Animal research models, Apes
2020
Characterizing meiotic recombination rates across the genomes of nonhuman primates is important for understanding the genetics of primate populations, performing genetic analyses of phenotypic variation and reconstructing the evolution of human recombination. Rhesus macaques (Macaca mulatta) are the most widely used nonhuman primates in biomedical research. We constructed a high-resolution genetic map of the rhesus genome based on whole genome sequence data from Indian-origin rhesus macaques. The genetic markers used were approximately 18 million SNPs, with marker density 6.93 per kb across the autosomes. We report that the genome-wide recombination rate in rhesus macaques is significantly lower than rates observed in apes or humans, while the distribution of recombination across the macaque genome is more uniform. These observations provide new comparative information regarding the evolution of recombination in primates.
Journal Article
PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations
by English, Adam C; Han, Yi; Boerwinkle, Eric
in Animal Genetics and Genomics, Biomedical and Life Sciences, Chromosome Aberrations
2015
Background
Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high.
Results
We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki–Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants.
Conclusions
The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.
Journal Article
Extremely low-coverage whole genome sequencing in South Asians captures population genomics information
by Zhou, Anbo; Xing, Jinchuan; Gibbs, Richard A.
in Animal Genetics and Genomics, Asian people, Asian People - genetics
2017
Background
The cost of Whole Genome Sequencing (WGS) has decreased tremendously in recent years due to advances in next-generation sequencing technologies. Nevertheless, the cost of carrying out large-scale cohort studies using WGS is still daunting. Past simulation studies with coverage at ~2x have shown promise for using low coverage WGS in studies focused on variant discovery, association study replications, and population genomics characterization. However, the performance of low coverage WGS in populations with a complex history and no reference panel remains to be determined.
Results
South Indian populations are known to have a complex population structure and are an example of a major population group that lacks adequate reference panels. To test the performance of extremely low-coverage WGS (EXL-WGS) in populations with a complex history and to provide a reference resource for South Indian populations, we performed EXL-WGS on 185 South Indian individuals from eight populations to ~1.6x coverage. Using two variant discovery pipelines, SNPTools and GATK, we generated a consensus call set that has ~90% sensitivity for identifying common variants (minor allele frequency ≥ 10%). Imputation further improves the sensitivity of our call set. In addition, we obtained high coverage of the whole mitochondrial genome to infer the maternal lineage evolutionary history of the Indian samples.
Conclusions
Overall, we demonstrate that EXL-WGS with imputation can be a valuable study design for variant discovery with a dramatically lower cost than standard WGS, even in populations with a complex history and without available reference data. In addition, the South Indian EXL-WGS data generated in this study will provide a valuable resource for future Indian genomic studies.
Journal Article
Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing
by Milosavljevic, Aleksandar; Chen, Zuozhou; Coarfa, Cristian
in Algorithms, Base Sequence, Bioinformatics
2010
Background
Massively parallel sequencing readouts of epigenomic assays are enabling integrative genome-wide analyses of genomic and epigenomic variation. Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing.
Results
Pash 3.0 generally matches the accuracy and speed of niche programs for fast mapping of short reads, and exceeds their performance on longer reads generated by a new generation of massively parallel sequencing technologies. By exploiting longer read lengths, Pash 3.0 maps reads onto the large fraction of genomic DNA that contains repetitive elements and polymorphic sites, including indel polymorphisms.
Conclusions
We demonstrate the versatility of Pash 3.0 by analyzing the interaction between CpG methylation, CpG SNPs, and imprinting based on publicly available whole-genome shotgun bisulfite sequencing data. Pash 3.0 makes use of gapped k-mer alignment, a non-seed based comparison method, which is implemented using multi-positional hash tables. This allows Pash 3.0 to run on diverse hardware platforms, including individual computers with standard RAM capacity, multi-core hardware architectures and large clusters.
Journal Article
A hybrid computational strategy to address WGS variant analysis in >5000 samples
by Huang, Zhuoyi; Carroll, Andrew; Boerwinkle, Eric
in Algorithms, BASIC BIOLOGICAL SCIENCES, Big data
2016
Background
The decreasing costs of sequencing are driving the need for cost-effective and real-time variant calling of whole genome sequencing data. The scale of these projects is far beyond the capacity of the typical computing resources available to most research labs. Other infrastructures, such as the AWS cloud environment and supercomputers, also have limitations that make large-scale joint variant calling infeasible, and infrastructure-specific variant calling strategies either fail to scale up to large datasets or abandon joint calling strategies.
Results
We present a high throughput framework including multiple variant callers for single nucleotide variant (SNV) calling, which leverages hybrid computing infrastructure consisting of cloud AWS, supercomputers and local high performance computing infrastructures. We present a novel binning approach for large scale joint variant calling and imputation which can scale up to over 10,000 samples while producing SNV callsets with high sensitivity and specificity. As a proof of principle, we present results of analysis on Cohorts for Heart And Aging Research in Genomic Epidemiology (CHARGE) WGS freeze 3 dataset in which joint calling, imputation and phasing of over 5300 whole genome samples was produced in under 6 weeks using four state-of-the-art callers. The callers used were SNPTools, GATK-HaplotypeCaller, GATK-UnifiedGenotyper and GotCloud. We used Amazon AWS, a 4000-core in-house cluster at Baylor College of Medicine, IBM power PC Blue BioU at Rice and Rhea at Oak Ridge National Laboratory (ORNL) for the computation. AWS was used for joint calling of 180 TB of BAM files, and ORNL and Rice supercomputers were used for the imputation and phasing step. All other steps were carried out on the local compute cluster. The entire operation used 5.2 million core hours and only transferred a total of 6 TB of data across the platforms.
Conclusions
Even with increasing sizes of whole genome datasets, ensemble joint calling of SNVs for low coverage data can be accomplished in a scalable, cost effective and fast manner by using heterogeneous computing platforms without compromising on the quality of variants.
Journal Article
Association of Single Nucleotide Polymorphisms in the ST3GAL4 Gene with VWF Antigen and Factor VIII Activity
2016
VWF is extensively glycosylated with biantennary core fucosylated glycans. Most N-linked and O-linked glycans on VWF are sialylated. FVIII is also glycosylated, with a glycan structure similar to that of VWF. ST3GAL sialyltransferases catalyze the transfer of sialic acids in the α2,3 linkage to termini of N- and O-glycans. This sialic acid modification is critical for VWF synthesis and activity. We analyzed genetic and phenotypic data from the Atherosclerosis Risk in Communities (ARIC) study for the association of single nucleotide polymorphisms (SNPs) in the ST3GAL4 gene with plasma VWF levels and FVIII activity in 12,117 subjects. We also analyzed ST3GAL4 SNPs found in 2,535 subjects of 26 ethnicities from the 1000 Genomes (1000G) project for ethnic diversity, SNP imputation, and ST3GAL4 haplotypes. We identified 14 and 1,714 ST3GAL4 variants in the ARIC GWAS and 1000G databases respectively, with 46% being ethnically diverse in their allele frequencies. Among the 14 ST3GAL4 SNPs found in ARIC GWAS, the intronic rs2186717, rs7928391, and rs11220465 were associated with VWF levels and with FVIII activity after adjustment for age, BMI, hypertension, diabetes, ever-smoking status, and ABO. This study illustrates the power of next-generation sequencing in the discovery of new genetic variants and a significant ethnic diversity in the ST3GAL4 gene. We discuss potential mechanisms through which these intronic SNPs regulate ST3GAL4 biosynthesis and the activity that affects VWF and FVIII.
Journal Article