Catalogue Search | MBRL

Genetic Variation, Comparative Genomics, and the Diagnosis of Disease

by Eichler, Evan E in Chromosomes, Deoxyribonucleic acid, Disease - genetics

2019

The genome is not akin to a string of fixed length. Many large segments of DNA may be present or absent — a major contributor to pathogenic genomic variation. New technologies in DNA sequencing are helping to uncover this type of variation, which often cannot be detected by standard DNA sequencing.

Journal Article

Limitations of next-generation genome sequence assembly

by Alkan, Can; Eichler, Evan E; Sajjadian, Saba in 631/1647/514/2254, 631/208/212, Algorithms

2011

High-throughput sequencing technologies promise to transform the fields of genetics and comparative biology by delivering tens of thousands of genomes in the near future. Although it is feasible to construct de novo genome assemblies in a few months, there has been relatively little attention to what is lost by sole application of short sequence reads. We compared the recent de novo assemblies using the short oligonucleotide analysis package (SOAP), generated from the genomes of a Han Chinese individual and a Yoruban individual, to experimentally validated genomic features. We found that de novo assemblies were 16.2% shorter than the reference genome and that 420.2 megabase pairs of common repeats and 99.1% of validated duplicated sequences were missing from the genome. Consequently, over 2,377 coding exons were completely missing. We conclude that high-quality sequencing approaches must be considered in conjunction with high-throughput sequencing for comparative genomics analyses and studies of genome evolution.

Journal Article

Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data

by Klammer, Aaron A; Eichler, Evan E; Huddleston, John in 631/114/2785, 631/1647/2217, 631/1647/514/1948

2013

Unlike hybrid approaches that use multiple libraries for de novo assembly, the hierarchical genome-assembly process uses data from only a single long-read SMRT sequencing library to produce high-quality finished microbial genome or BAC assemblies in an automated workflow. We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph–based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.

Journal Article

Genetic variation and the de novo assembly of human genomes

by Eichler, Evan E.; Wilson, Richard K.; Chaisson, Mark J. P. in 631/114/2785, 631/114/2785/2302, 631/208/212/2301

2015

Key Points: Complete de novo assembly of a genome is guaranteed to allow assessment of the full range of genetic variation, although the only mammalian genome assemblies completed to date are for human and mouse. Assemblies using massively parallel sequencing (MPS) have increased the diversity of draft genomes that are available but do not completely resolve genomes. When designing a de novo assembly project, the most-suitable assembly approach to use differs depending on the characteristics of the sequencing reads. MPS methods have relied on de Bruijn graphs, whereas single-molecule sequencing (SMS) reads require pairwise overlaps encoded in overlap or string graphs. A component of 'missing heritability' is missed sequence variation. Approximately 5–40 Mb of sequence are absent from any given human reference genome owing to structural polymorphism, and standard resequencing has missed detection of diseases such as medullary cystic kidney disease type 1, amyotrophic lateral sclerosis and facioscapulohumeral muscular dystrophy. Single-molecule long-read sequencing is currently driving gains in genome assembly accuracy and completeness, but new technologies are being developed to generate long-range information, such as optical maps and dilution pool sequencing, that may aid in scaffolding complex regions. The wealth of existing and emerging DNA-sequencing data provides an opportunity for a comprehensive understanding of human genetic variation, including the discovery of disease-causing variants.

This Review describes how the limitations of current reference-genome assemblies confound the characterization of genetic variation and how this can be mitigated by important advances in algorithms and sequencing technology that facilitate the de novo assembly of genomes.

The discovery of genetic variation and the assembly of genome sequences are both inextricably linked to advances in DNA-sequencing technology. Short-read massively parallel sequencing has revolutionized our ability to discover genetic variation but is insufficient to generate high-quality genome assemblies or resolve most structural variation. Full resolution of variation is only guaranteed by complete de novo assembly of a genome. Here, we review approaches to genome assembly, the nature of gaps or missing sequences, and biases in the assembly process. We describe the challenges of generating a complete de novo genome assembly using current technologies and the impact that being able to perfectly sequence the genome would have on understanding human disease and evolution. Finally, we summarize recent technological advances that improve both contiguity and accuracy and emphasize the importance of complete de novo assembly as opposed to read mapping as the primary means to understanding the full range of human genetic variation.
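The record above notes that MPS assemblers have relied on de Bruijn graphs. As a rough illustration only, and not the method of any assembler named in this listing, the sketch below builds a toy de Bruijn graph from short reads. The function name `de_bruijn_edges`, the toy reads, and the choice of k = 3 are all assumptions made for the example.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Build a toy de Bruijn graph: each k-mer adds one edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix. Hypothetical helper."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Toy reads invented for illustration; not real sequencing data.
reads = ["ACGTAC", "CGTACG"]
graph = de_bruijn_edges(reads, k=3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Because every occurrence of the same k-mer maps to the same edge, a repeat longer than k collapses onto shared nodes in such a graph, which is one reason short reads alone struggle to resolve repetitive sequence.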

Journal Article

Applications of long-read sequencing to Mendelian genetics

by Mastrorosa, Francesco Kumara; Eichler, Evan E.; Miller, Danny E. in Accuracy, Applications of technology in health and disease, Bioinformatics

2023

Advances in clinical genetic testing, including the introduction of exome sequencing, have uncovered the molecular etiology for many rare and previously unsolved genetic disorders, yet more than half of individuals with a suspected genetic disorder remain unsolved after complete clinical evaluation. A precise genetic diagnosis may guide clinical treatment plans, allow families to make informed care decisions, and permit individuals to participate in N-of-1 trials; thus, there is high interest in developing new tools and techniques to increase the solve rate. Long-read sequencing (LRS) is a promising technology for both increasing the solve rate and decreasing the amount of time required to make a precise genetic diagnosis. Here, we summarize current LRS technologies, give examples of how they have been used to evaluate complex genetic variation and identify missing variants, and discuss future clinical applications of LRS. As costs continue to decrease, LRS will find additional utility in the clinical space, fundamentally changing how pathological variants are discovered and eventually acting as a single data source that can be interrogated multiple times for clinical service.

Journal Article

An Incomplete Understanding of Human Genetic Variation

by Eichler, Evan E; Huddleston, John in Consortia, DNA Copy Number Variations, Genes

2016

Deciphering the genetic basis of human disease requires a comprehensive knowledge of genetic variants irrespective of their class or frequency. Although an impressive number of human genetic variants have been catalogued, a large fraction of the genetic difference that distinguishes two human genomes is still not understood at the base-pair level. This is because the emphasis has been on single-nucleotide variation as opposed to less tractable and more complex genetic variants, including indels and structural variants. The latter, we propose, will have a large impact on human phenotypes but require a more systematic assessment of genomes at deeper coverage and alternate sequencing and mapping technologies.

Journal Article

Prioritization of neurodevelopmental disease genes by discovery of new mutations

by Hoischen, Alexander; Eichler, Evan E; Krumm, Niklas in 45/41, 49/23, 49/47

2014

Advances in genome sequencing technologies have revolutionized the search for rare and penetrant mutations leading to diseases such as autism. Given that all individuals carry new and disruptive mutations, in this Review, the authors discuss ways to home in on pathogenic mutations associated with neurodevelopmental disorders. Advances in genome sequencing technologies have begun to revolutionize neurogenetics, allowing the full spectrum of genetic variation to be better understood in relation to disease. Exome sequencing of hundreds to thousands of samples from patients with autism spectrum disorder, intellectual disability, epilepsy and schizophrenia provides strong evidence of the importance of de novo and gene-disruptive events. There are now several hundred new candidate genes and targeted resequencing technologies that allow screening of dozens of genes in tens of thousands of individuals with high specificity and sensitivity. The decision of which genes to pursue depends on many factors, including recurrence, previous evidence of overlap with pathogenic copy number variants, the position of the mutation in the protein, the mutational burden among healthy individuals and membership of the candidate gene in disease-implicated protein networks. We discuss these emerging criteria for gene prioritization and the potential impact on the field of neuroscience.

Journal Article

Single-cell epigenomics reveals mechanisms of human cortical development

by Pollard, Katherine S.; Ament, Seth A.; Ahituv, Nadav in 13/51, 38/35, 45/47

2021

During mammalian development, differences in chromatin state coincide with cellular differentiation and reflect changes in the gene regulatory landscape [1]. In the developing brain, cell fate specification and topographic identity are important for defining cell identity [2] and confer selective vulnerabilities to neurodevelopmental disorders [3]. Here, to identify cell-type-specific chromatin accessibility patterns in the developing human brain, we used a single-cell assay for transposase accessibility by sequencing (scATAC-seq) in primary tissue samples from the human forebrain. We applied unbiased analyses to identify genomic loci that undergo extensive cell-type- and brain-region-specific changes in accessibility during neurogenesis, and an integrative analysis to predict cell-type-specific candidate regulatory elements. We found that cerebral organoids recapitulate most putative cell-type-specific enhancer accessibility patterns but lack many cell-type-specific open chromatin regions that are found in vivo. Systematic comparison of chromatin accessibility across brain regions revealed unexpected diversity among neural progenitor cells in the cerebral cortex and implicated retinoic acid signalling in the specification of neuronal lineage identity in the prefrontal cortex. Together, our results reveal the important contribution of chromatin state to the emerging patterns of cell type diversity and cell fate specification and provide a blueprint for evaluating the fidelity and robustness of cerebral organoids as a model for cortical development. Analysis of chromatin state at a single-cell level in samples of developing human forebrain demonstrates both cell-type-specific and region-specific changes during neurogenesis.

Journal Article

Excess of rare, inherited truncating mutations in autism

by Leal, Suzanne M; Eichler, Evan E; Stessman, Holly A in 45/61, 45/77, 631/208/212

2015

Evan Eichler and colleagues analyze the relative impact of de novo and rare, inherited variants on autism risk. They show a statistically independent role for rare, inherited mutations and implicate several new candidate genes likely contributing to autism risk. To assess the relative impact of inherited and de novo variants on autism risk, we generated a comprehensive set of exonic single-nucleotide variants (SNVs) and copy number variants (CNVs) from 2,377 families with autism. We find that private, inherited truncating SNVs in conserved genes are enriched in probands (odds ratio = 1.14, P = 0.0002) in comparison to unaffected siblings, an effect involving significant maternal transmission bias to sons. We also observe a bias for inherited CNVs, specifically for small (<100 kb), maternally inherited events (P = 0.01) that are enriched in CHD8 target genes (P = 7.4 × 10⁻³). Using a logistic regression model, we show that private truncating SNVs and rare, inherited CNVs are statistically independent risk factors for autism, with odds ratios of 1.11 (P = 0.0002) and 1.23 (P = 0.01), respectively. This analysis identifies a second class of candidate genes (for example, RIMS1, CUL7 and LZTR1) where transmitted mutations may create a sensitized background but are unlikely to be completely penetrant.
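For readers unfamiliar with the odds ratios quoted in this record, the sketch below shows the simplest 2×2-table form of the statistic. It is not the authors' logistic regression model, and the counts are hypothetical values chosen only to make the arithmetic visible; the function name `odds_ratio` is likewise an assumption.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds of carrying a variant among cases divided by the odds
    among controls (plain 2x2-table odds ratio)."""
    case_odds = case_carriers / case_noncarriers
    control_odds = control_carriers / control_noncarriers
    return case_odds / control_odds


# Hypothetical counts, NOT taken from the study summarized above.
or_value = odds_ratio(case_carriers=120, case_noncarriers=880,
                      control_carriers=100, control_noncarriers=900)
print(f"odds ratio = {or_value:.2f}")
```

A value above 1 indicates that the variant class is more common among cases than controls; values such as those reported in the record are modest enrichments of this kind.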

Journal Article

Resolving the complexity of the human genome using single-molecule sequencing

by Boitano, Matthew; Landolin, Jane M.; Stamatoyannopoulos, John A. in 45/23, 631/208/212/748, 631/208/726/649/2157

2015

Single-molecule, real-time DNA sequencing is used to analyse a haploid human genome (CHM1), thus closing or extending more than half of the remaining 164 euchromatic gaps in the human genome; the complete sequences of euchromatic structural variants (including inversions, complex insertions and tandem repeats) are resolved at the base-pair level, suggesting that a greater complexity of the human genome can now be accessed.

Deep-sequencing the human genome: The human genome is considered sequenced, yet more than 160 euchromatic gaps remain and many aspects of its structural variation are poorly understood. Evan Eichler and colleagues sequenced and analysed a haploid human genome (CHM1) using single-molecule, real-time (SMRT) DNA sequencing and by doing so closed, or in some cases extended, more than half of the remaining gaps. They also resolved the complete sequence of numerous euchromatic structural variants at the base-pair level, revealing inversions, complex insertions and long tracts of tandem repeats, some of them previously unknown. Thanks to the longer-read sequencing technology applied here, the complexity of the human genome that stems from variation of longer and more complex repetitive DNA can now be largely resolved.

The human genome is arguably the most complete mammalian reference assembly [1, 2, 3], yet more than 160 euchromatic gaps remain [4, 5, 6] and aspects of its structural variation remain poorly understood ten years after its completion [7, 8, 9]. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing [10]. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.

Journal Article
