Catalogue Search | MBRL

Single-Molecule Real-Time Sequencing Combined with Optical Mapping Yields Completely Finished Fungal Genome

by van den Berg, Grardy C. M. , Janssen, Antoine , Datema, Erwin in Chromosome Mapping - methods , Chromosomes , Datasets

2015

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism's biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes. IMPORTANCE Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions.
Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism's biology.

Journal Article

Cross-Species Bacterial Artificial Chromosome-Fluorescence in Situ Hybridization Painting of the Tomato and Potato Chromosome 6 Reveals Undescribed Chromosomal Rearrangements

by Vossen, Edwin A.G. van der , Datema, Erwin , Lankhorst, Rene Klein in Artificial chromosomes , bacterial artificial chromosomes , Chromosome Aberrations

2008

Ongoing genomics projects of tomato (Solanum lycopersicum) and potato (S. tuberosum) are providing unique tools for comparative mapping studies in Solanaceae. At the chromosomal level, bacterial artificial chromosomes (BACs) can be positioned on pachytene complements by fluorescence in situ hybridization (FISH) on homeologous chromosomes of related species. Here we present results of such a cross-species multicolor cytogenetic mapping of tomato BACs on potato chromosomes 6 and vice versa. The experiments were performed under low hybridization stringency, while blocking with Cot-100 was essential in suppressing excessive hybridization of repeat signals in both within-species FISH and cross-species FISH of tomato BACs. In the short arm we detected a large paracentric inversion that covers the whole euchromatin part with breakpoints close to the telomeric heterochromatin and at the border of the short arm pericentromere. The long arm BACs revealed no deviation in the colinearity between tomato and potato. Further comparison between tomato cultivars Cherry VFNT and Heinz 1706 revealed colinearity of the tested tomato BACs, whereas one of the six potato clones (RH98-856-18) showed minor putative rearrangements within the inversion. Our results present cross-species multicolor BAC–FISH as a unique tool for comparative genetic studies across Solanum species.

Journal Article

Homologues of potato chromosome 5 show variable collinearity in the euchromatin, but dramatic absence of sequence similarity in the pericentromeric heterochromatin

by van Ham, Roeland C H J , de Jong, Hans , Borm, Theo J A in Analysis , Animal Genetics and Genomics , Biomedical and Life Sciences

2015

Background In flowering plants it has been shown that de novo genome assemblies of different species and genera show a significant drop in the proportion of alignable sequence. Within a plant species, however, it is assumed that different haplotypes of the same chromosome align well. In this paper we have compared three de novo assemblies of potato chromosome 5 and report on the sequence variation and the proportion of sequence that can be aligned. Results For the diploid potato clone RH89-039-16 (RH) we produced two linkage phase controlled and haplotype-specific assemblies of chromosome 5 based on BAC-by-BAC sequencing, which were aligned to each other and compared to the 52 Mb chromosome 5 reference sequence of the doubled monoploid clone DM 1–3 516 R44 (DM). We identified 17.0 Mb of non-redundant sequence scaffolds derived from euchromatic regions of RH and 38.4 Mb from the pericentromeric heterochromatin. For 32.7 Mb of the RH sequences the correct position and order on chromosome 5 was determined, using genetic markers, fluorescence in situ hybridisation and alignment to the DM reference genome. This ordered fraction of the RH sequences is situated in the euchromatic arms and in the heterochromatin borders. In the euchromatic regions, the sequence collinearity between the three chromosomal homologs is good, but interruption of collinearity occurs at nine gene clusters. Towards and into the heterochromatin borders, absence of collinearity due to structural variation was more extensive and was caused by hemizygous and poorly aligning regions of up to 450 kb in length. In the most central heterochromatin, a total of 22.7 Mb sequence from both RH haplotypes remained unordered. These RH sequences have very few syntenic regions and represent a non-alignable region between the RH and DM heterochromatin haplotypes of chromosome 5. 
Conclusions Our results show that among homologous potato chromosomes large regions are present with dramatic loss of sequence collinearity. This stresses the need for more de novo reference assemblies in order to capture genome diversity in this crop. The discovery of three highly diverged pericentric heterochromatin haplotypes within one species is a novelty in plant genome analysis. The possible origin and cytogenetic implication of this heterochromatin haplotype diversity are discussed.

Journal Article

Correction: The Genomes of the Fungal Plant Pathogens Cladosporium fulvum and Dothistroma septosporum Reveal Adaptation to Different Hosts and Lifestyles But Also Signatures of Common Ancestry

by Dhillon, Braham , Lindquist, Erika , Jashni, Mansoor Karimi

2015

Journal Article

FISH mapping and molecular organization of the major repetitive sequences of tomato

by Vosman, Ben , de Jong, Hans , Kuipers, Anja in Animal Genetics and Genomics , Bacteria , bacterial artificial chromosomes

2008

This paper presents a bird's-eye view of the major repeats and chromatin types of tomato. Using fluorescence in-situ hybridization (FISH) with Cot-1, Cot-10 and Cot-100 DNA as probes we mapped repetitive sequences of different complexity on pachytene complements. Cot-100 was found to cover all heterochromatin regions, and could be used to identify repeat-rich clones in BAC filter hybridization. Next we established the chromosomal locations of the tandem and dispersed repeats with respect to euchromatin, nucleolar organizer regions (NORs), heterochromatin, and centromeres. The tomato genomic repeats TGRII and TGRIII appeared to be major components of the pericentromeres, whereas the newly discovered TGRIV repeat was found mainly in the structural centromeres. The highly methylated NOR of chromosome 2 is rich in [GACA]₄, a microsatellite that also forms part of the pericentromeres, together with [GA]₈, [GATA]₄ and Ty1-copia. Based on the morphology of pachytene chromosomes and the distribution of repeats studied so far, we now propose six different chromatin classes for tomato: (1) euchromatin, (2) chromomeres, (3) distal heterochromatin and interstitial heterochromatic knobs, (4) pericentromere heterochromatin, (5) functional centromere heterochromatin and (6) nucleolar organizer region.

Journal Article

A PARTHENOGENESIS allele from apomictic dandelion can induce egg cell division without fertilization in lettuce

by Radoeva, Tatyana , Mansveld, Sandra , Blom, Evert-Jan in 631/136/2086 , 631/208/2491 , 631/337/2019

2022

Apomixis, the clonal formation of seeds, is a rare yet widely distributed trait in flowering plants. We have isolated the PARTHENOGENESIS (PAR) gene from apomictic dandelion that triggers embryo development in unfertilized egg cells. PAR encodes a K2-2 zinc finger, EAR-domain protein. Unlike the recessive sexual alleles, the dominant PAR allele is expressed in egg cells and has a miniature inverted-repeat transposable element (MITE) transposon insertion in the promoter. The MITE-containing promoter can invoke a homologous gene from sexual lettuce to complement dandelion LOSS OF PARTHENOGENESIS mutants. A similar MITE is also present in the promoter of the PAR gene in apomictic forms of hawkweed, suggesting a case of parallel evolution. Heterologous expression of dandelion PAR in lettuce egg cells induced haploid embryo-like structures in the absence of fertilization. Sexual PAR alleles are expressed in pollen, suggesting that the gene product releases a block on embryogenesis after fertilization in sexual species while in apomictic species PAR expression triggers embryogenesis in the absence of fertilization. The PARTHENOGENESIS (PAR) gene is identified in apomictic dandelion. A dominant allele has a MITE transposon insertion similar to that found in apomictic hawkweed. Expression of dandelion PAR in lettuce induces embryo-like structures without fertilization.

Journal Article

De novo sequencing, assembly and analysis of the genome of the laboratory strain Saccharomyces cerevisiae CEN.PK113-7D, a model for modern industrial biotechnology

by de Kok, Stefan , Vongsangnak, Wanwipa , Paddon, Chris J in Adenylate cyclase , alcoholic fermentation , Applied Microbiology

2012

Saccharomyces cerevisiae CEN.PK 113-7D is widely used for metabolic engineering and systems biology research in industry and academia. We sequenced, assembled, annotated and analyzed its genome. Single-nucleotide variations (SNV), insertions/deletions (indels) and differences in genome organization compared to the reference strain S. cerevisiae S288C were analyzed. In addition to a few large deletions and duplications, nearly 3000 indels were identified in the CEN.PK113-7D genome relative to S288C. These differences were overrepresented in genes whose functions are related to transcriptional regulation and chromatin remodelling. Some of these variations were caused by unstable tandem repeats, suggesting an innate evolvability of the corresponding genes. Besides a previously characterized mutation in adenylate cyclase, the CEN.PK113-7D genome sequence revealed a significant enrichment of non-synonymous mutations in genes encoding for components of the cAMP signalling pathway. Some phenotypic characteristics of the CEN.PK113-7D strains were explained by the presence of additional specific metabolic genes relative to S288C. In particular, the presence of the BIO1 and BIO6 genes correlated with a biotin prototrophy of CEN.PK113-7D. Furthermore, the copy number, chromosomal location and sequences of the MAL loci were resolved. The assembled sequence reveals that CEN.PK113-7D has a mosaic genome that combines characteristics of laboratory strains and wild-industrial strains.

Journal Article

Population genomic analysis reveals differential evolutionary histories and patterns of diversity across subgenomes and subpopulations of Brassica napus L

by Brown, Jack , Gore, Michael A. , Datema, Erwin in Adaptation , allopolyploidy , biogeography

2016

Brassica napus (L.) is a crop of major economic importance that produces canola oil (seed), vegetables, fodder and animal meal. Characterizing the genetic diversity present in the extant germplasm pool of B. napus is fundamental to better conserve, manage and utilize the genetic resources of this species. We used sequence-based genotyping to detect and genotype 30,881 SNPs in a diversity panel of 782 B. napus samples representative of the major ecotypes and worldwide geographic distribution of the species. Given that B. napus is an allotetraploid, we focused our analysis on patterns of genetic variation that differ between the two subgenomes. Our results reveal a strong population structure, mainly splitting accessions of spring types (SP), winter Europeans (WE) and winter Asian (WA) types. The number of population-specific SNPs was the highest in WA and comparable between SP and WE. However, the SNPs in WE had on average a lower frequency than in SP. Phylogenetic inferences showed a different evolutionary history in the two subgenomes, placing WE and WA as basal clades for the other populations in the C and A subgenomes, respectively. Finally, we identified 16 genomic regions where the patterns of diversity differ markedly from the genome average, several of which are consistent with putative genomic inversions. The SNPs discovered in our analysis are publicly available, providing a valuable resource for the community.

Journal Article

Chromosomal organizations of major repeat families on potato (Solanum tuberosum) and further exploring in its sequenced genome

by Guzman, Myriam Olortegui , de Jong, Hans , Datema, Erwin in Animal Genetics and Genomics , bacterial artificial chromosomes , Base Sequence

2014

One of the most powerful technologies in unraveling the organization of a eukaryotic plant genome is high-resolution fluorescent in situ hybridization of repeats and single copy DNA sequences on pachytene chromosomes. This technology allows the integration of physical mapping information with chromosomal positions, including centromeres, telomeres, nucleolar-organizing region, and euchromatin and heterochromatin. In this report, we established chromosomal positions of different repeat fractions of the potato genomic DNA (Cot100, Cot500 and Cot1000) on the chromosomes. We also analysed various repeat elements that are unique to potato including the moderately repetitive P5 and REP2 elements, where REP2 is part of a larger Gypsy-type LTR retrotransposon and covers most chromosome regions, with some brighter fluorescing spots in the heterochromatin. The most abundant tandem repeat is the potato genomic repeat 1 that covers subtelomeric regions of most chromosome arms. Extensive multiple alignments of these repetitive sequences in the assembled RH89-039-16 potato BACs and the draft assembly of the DM1-3 516 R44 genome shed light on the conservation of these repeats within the potato genome. The consensus sequences thus obtained revealed the native complete transposable elements from which they were derived.

Journal Article

Genome Bioinformatics of Tomato and Potato

by Datema, Erwin in Agricultural research , Agriculture , Agronomy

2011

In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformatics strategies and tools, both those that are required for data processing, management, and quality control, and those used for interpretation of the data. This thesis focuses on the application of genome sequencing, assembly and annotation to two members of the Solanaceae family, tomato and potato. Potato is the economically most important species within the Solanaceae, and its tubers contribute to dietary intake of starch, protein, antioxidants, and vitamins. Tomato fruits are the second most consumed vegetable after potato, and are a globally important dietary source of lycopene, beta-carotene, vitamin C, and fiber. The chapters in this thesis document the generation, exploitation and interpretation of genomic sequence resources for these two species and shed light on the contents, structure and evolution of their genomes. Chapter 1 introduces the concepts of genome sequencing, assembly and annotation, and explains the novel genome sequencing technologies that have been developed in the past decade. These so-called Next Generation Sequencing platforms display considerable variation in chemistry and workflow, and as a consequence the throughput and data quality differ by orders of magnitude between the platforms. The currently available sequencing platforms produce a vast variety of read lengths and facilitate the generation of paired sequences with an approximately fixed distance between them.
The choice of sequencing chemistry and platform combined with the type of sequencing template demands specifically adapted bioinformatics for data processing and interpretation. Irrespective of the sequencing and assembly strategy that is chosen, the resulting genome sequence, often represented by a collection of long linear strings of nucleotides, is of limited interest by itself. Interpretation of the genome can only be achieved through sequence annotation – that is, identification and classification of all functional elements in a genome sequence. Once these elements have been annotated, sequence alignments between multiple genomes of related accessions or species can be utilized to reveal the genetic variation on both the nucleotide and the structural level that underlies the difference between these species or accessions. Chapter 2 describes BlastIf, a novel software tool that exploits sequence similarity searches with BLAST to provide a straightforward annotation of long nucleotide sequences. Generally, two problems are associated with the alignment of a long nucleotide sequence to a database of short gene or protein sequences: (i) the large number of similar hits that can be generated due to database redundancy; and (ii) the relationships implied between aligned segments within a hit that in fact correspond to distinct elements on the sequence such as genes. BlastIf generates a comprehensible BLAST output for long nucleotide sequences by reducing the number of similar hits while revealing most of the variation present between hits. It is a valuable tool for molecular biologists who wish to get a quick overview of the genetic elements present in a newly sequenced segment of DNA, prior to more elaborate efforts of gene structure prediction and annotation. In Chapter 3 a first genome-wide comparison between the emerging genomic sequence resources of tomato and potato is presented.
Large collections of BAC end sequences from both species were annotated through repeat searches, transcript alignments and protein domain identification. In-depth comparisons of the annotated sequences revealed remarkable differences in both gene and repeat content between these closely related genomes. The tomato genome was found to be more repetitive than the potato genome, and substantial differences in the distribution of Gypsy and Copia retrotransposable elements as well as microsatellites were observed between the two genomes. A higher gene content was identified in the potato sequences, and in particular several large gene families including cytochrome P450 mono-oxygenases and serine-threonine protein kinases were significantly overrepresented in potato compared to tomato. Moreover, the cytochrome P450 gene family was found to be expanded in both tomato and potato when compared to Arabidopsis thaliana, suggesting an expanded network of secondary metabolic pathways in the Solanaceae. Together these findings present a first glimpse into the evolution of Solanaceous genomes, both within the family and relative to other plant species. Chapter 4 explores the physical and genetic organization of tomato chromosome 6 through integration of BAC sequence analysis, High Information Content Fingerprinting, genetic analysis, and BAC-FISH mapping data. A collection of BACs spanning substantial parts of the short and long arm euchromatin and several dispersed regions of the pericentromeric heterochromatin were sequenced and assembled into several tiling paths spanning approximately 11 Mb. Overall, the cytogenetic order of BACs was in agreement with the order of BACs anchored to the Tomato EXPEN 2000 genetic map, although a few striking discrepancies were observed. The integration of BAC-FISH, sequence and genetic mapping data furthermore provided a clear picture of the borders between eu- and heterochromatin on chromosome 6.
Annotation of the BAC sequences revealed that, although the majority of protein-coding genes were located in the euchromatin, the highly repetitive pericentromeric heterochromatin displayed an unexpectedly high gene content. Moreover, the short arm euchromatin was relatively rich in repeats, but the ratio of Gypsy and Copia retrotransposons across the different domains of the chromosome clearly distinguished euchromatin from heterochromatin. The ongoing whole-genome sequencing effort will reveal if these properties are unique for tomato chromosome 6, or a more general property of the tomato genome. Chapter 5 presents the potato genome, the first genome sequence of an Asterid. To overcome the problems associated with genome assembly due to the high level of heterozygosity that is observed in commercial tetraploid potato varieties, a homozygous doubled-monoploid potato clone was exploited to sequence and assemble 86% of the 844 Mb genome. This potato reference genome sequence was complemented with re-sequencing of a heterozygous diploid clone, revealing the form and extent of sequence polymorphism both between different genotypes and within a single heterozygous genotype. Gene presence/absence variants and other potentially deleterious mutations were found to occur frequently in potato and are a likely cause of inbreeding depression. Annotation of the genome was supported by deep transcriptome sequencing of both the doubled-monoploid and the heterozygous potato, resulting in the prediction of more than 39,000 protein coding genes. Transcriptome analysis provided evidence for the contribution of gene family expansion, tissue specific expression, and recruitment of genes to new pathways to the evolution of tuber development. The sequence of the potato genome has provided new insights into Eudicot genome evolution and has provided a solid basis for the elucidation of the evolution of tuberisation.
Many traits of interest to plant breeders are quantitative in nature and the potato sequence will simplify both their characterization and deployment to generate novel cultivars. The outstanding challenges in plant genome sequencing are addressed in Chapter 6. The high concentration of repetitive elements and the heterozygosity and polyploidy of many interesting crop plant species currently pose a barrier for the efficient reconstruction of their genome sequences. Nonetheless, the completion of a large number of new genome sequences in recent years and the ongoing advances in sequencing technology provide many exciting opportunities for plant breeding and genome research. Current sequencing platforms are being continuously updated and improved, and novel technologies are being developed and implemented in third-generation sequencing platforms that sequence individual molecules without need for amplification. While these technologies create exciting opportunities for new sequencing applications, they also require robust software tools to process the data produced through them efficiently. The ever increasing amount of available genome sequences creates the need for an intuitive platform for the automated and reproducible interrogation of these data in order to formulate new biologically relevant questions on datasets spanning hundreds or thousands of genome sequences.

Dissertation
