Catalogue Search | MBRL
Search Results Heading
Explore the vast range of titles available.
MBRLSearchResults
-
DisciplineDiscipline
-
Is Peer ReviewedIs Peer Reviewed
-
Item TypeItem Type
-
SubjectSubject
-
YearFrom:-To:
-
More FiltersMore FiltersSourceLanguage
Done
Filters
Reset
12,865
result(s) for
"diploidy"
Sort by:
Haplotype-resolved assembly of diploid genomes without parental data
by
Jarvis, Erich D.
,
Fedrigo, Olivier
,
Gemmell, Neil J.
in
631/114/2785/2302
,
631/114/794
,
Agriculture
2022
Routine haplotype-resolved genome assembly from single samples remains an unresolved problem. Here we describe an algorithm that combines PacBio HiFi reads and Hi-C chromatin interaction data to produce a haplotype-resolved assembly without the sequencing of parents. Applied to human and other vertebrate samples, our algorithm consistently outperforms existing single-sample assembly pipelines and generates assemblies of similar quality to the best pedigree-based assemblies.
Haplotype-resolved genome assemblies are generated by combining HiFi reads with Hi-C long-range interactions.
Journal Article
A draft human pangenome reference
2023
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals
1
. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
An initial draft of the human pangenome is presented and made publicly available by the Human Pangenome Reference Consortium; the draft contains 94 de novo haplotype assemblies from 47 ancestrally diverse individuals.
Journal Article
Telomere-to-telomere assembly of diploid chromosomes with Verkko
by
Rautiainen, Mikko
,
Koren, Sergey
,
Rhie, Arang
in
631/114/2785/2302
,
631/208/514/1948
,
631/208/728
2023
The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio high-fidelity reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph by integrating ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.
Integration of long and ultra-long reads results in improved phased, diploid assemblies.
Journal Article
A robust benchmark for detection of germline large deletions and insertions
2020
New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
Detection of structural variants in the human genome is facilitated by a benchmark set of large deletions and insertions.
Journal Article
Phased diploid genome assembly with single-molecule real-time sequencing
by
Schatz, Michael C
,
Rank, David R
,
Nattestad, Maria
in
631/114/2785/2302
,
631/61/514/1948
,
Algorithms
2016
The open-source FALCON and FALCON-Unzip software utilize long-read sequencing data to generate contiguous, accurate and phased diploid assemblies, even from genomes that are highly heterozygous.
While genome assembly projects have been successful in many haploid and inbred species, the assembly of noninbred or rearranged heterozygous genomes remains a major challenge. To address this challenge, we introduce the open-source FALCON and FALCON-Unzip algorithms (
https://github.com/PacificBiosciences/FALCON/
) to assemble long-read sequencing data into highly accurate, contiguous, and correctly phased diploid genomes. We generate new reference sequences for heterozygous samples including an F1 hybrid of
Arabidopsis thaliana
, the widely cultivated
Vitis vinifera
cv. Cabernet Sauvignon, and the coral fungus
Clavicorona pyxidata
, samples that have challenged short-read assembly approaches. The FALCON-based assemblies are substantially more contiguous and complete than alternate short- or long-read approaches. The phased diploid assembly enabled the study of haplotype structure and heterozygosities between homologous chromosomes, including the identification of widespread heterozygous structural variation within coding sequences.
Journal Article
Phased diploid genome assemblies and pan-genomes provide insights into the genetic history of apple domestication
2020
Domestication of the apple was mainly driven by interspecific hybridization. In the present study, we report the haplotype-resolved genomes of the cultivated apple (
Malus domestica
cv. Gala) and its two major wild progenitors,
M. sieversii
and
M. sylvestris
. Substantial variations are identified between the two haplotypes of each genome. Inference of genome ancestry identifies ~23% of the Gala genome as of hybrid origin. Deep sequencing of 91 accessions identifies selective sweeps in cultivated apples that originated from either of the two progenitors and are associated with important domestication traits. Construction and analyses of apple pan-genomes uncover thousands of new genes, with hundreds of them being selected from one of the progenitors and largely fixed in cultivated apples, revealing that introgression of new genes/alleles is a hallmark of apple domestication through hybridization. Finally, transcriptome profiles of Gala fruits at 13 developmental stages unravel ~19% of genes displaying allele-specific expression, including many associated with fruit quality.
Phased diploid genomes of the cultivated apple
Malus domestica
cv. Gala and its two major wild progenitors
M. sieversii
and
M. sylvestris
, as well as pan-genome analyses, provide insights into the genetic history of apple domestication.
Journal Article
Reference genome assemblies reveal the origin and evolution of allohexaploid oat
2022
Common oat (
Avena sativa
) is an important cereal crop serving as a valuable source of forage and human food. Although reference genomes of many important crops have been generated, such work in oat has lagged behind, primarily owing to its large, repeat-rich polyploid genome. Here, using Oxford Nanopore ultralong sequencing and Hi-C technologies, we have generated a reference-quality genome assembly of hulless common oat, comprising 21 pseudomolecules with a total length of 10.76 Gb and contig N50 of 75.27 Mb. We also produced genome assemblies for diploid and tetraploid
Avena
ancestors, which enabled the identification of oat subgenomes and provided insights into oat chromosomal evolution. The origin of hexaploid oat is inferred from whole-genome sequencing, chloroplast genomes and transcriptome assemblies of different
Avena
species. These findings and the high-quality reference genomes presented here will facilitate the full use of crop genetic resources to accelerate oat improvement.
A reference-quality genome assembly of hexaploid oat variety ‘Sanfensan’ and genome assemblies of its diploid and tetraploid
Avena
ancestors provide insights into the evolutionary history of allohexaploid oat.
Journal Article
Uncovering hidden variation in polyploid wheat
by
Fosker, Christine
,
Ramirez-Gonzalez, Ricardo H.
,
Clissold, Leah
in
Agricultural Sciences
,
Biological Sciences
,
Crops
2017
Comprehensive reverse genetic resources, which have been key to understanding gene function in diploid model organisms, are missing in many polyploid crops. Young polyploid species such as wheat, which was domesticated less than 10,000 y ago, have high levels of sequence identity among subgenomes that mask the effects of recessive alleles. Such redundancy reduces the probability of selection of favorable mutations during natural or human selection, but also allows wheat to tolerate high densities of induced mutations. Here we exploited this property to sequence and catalog more than 10 million mutations in the protein-coding regions of 2,735 mutant lines of tetraploid and hexaploid wheat. We detected, on average, 2,705 and 5,351 mutations per tetraploid and hexaploid line, respectively, which resulted in 35–40 mutations per kb in each population. With these mutation densities, we identified an average of 23–24 missense and truncation alleles per gene, with at least one truncation or deleterious missense mutation in more than 90% of the captured wheat genes per population. This public collection of mutant seed stocks and sequence data enables rapid identification of mutations in the different copies of the wheat genes, which can be combined to uncover previously hidden variation. Polyploidy is a central phenomenon in plant evolution, and many crop species have undergone recent genome duplication events. Therefore, the general strategy and methods developed herein can benefit other polyploid crops.
Journal Article