Catalogue Search | MBRL

Sequencing and automated whole-genome optical mapping of the genome of a domestic goat (Capra hircus)

by Yang, Shuang , Cockett, Noelle , Bian, Chao in 631/1647/1513/1382 , 631/61/212 , 631/61/514

2013

An instrument for whole-genome optical mapping is used to assemble the genome of the domestic goat into super-long scaffolds. We report the ∼2.66-Gb genome sequence of a female Yunnan black goat. The sequence was obtained by combining short-read sequencing data and optical mapping data from a high-throughput whole-genome mapping instrument. The whole-genome mapping data facilitated the assembly of super-scaffolds >5× longer by the N50 metric than scaffolds augmented by fosmid end sequencing (scaffold N50 = 3.06 Mb, super-scaffold N50 = 16.3 Mb). Super-scaffolds are anchored on chromosomes based on conserved synteny with cattle, and the assembly is well supported by two radiation hybrid maps of chromosome 1. We annotate 22,175 protein-coding genes, most of which were recovered in the RNA-seq data of ten tissues. Comparative transcriptomic analysis of the primary and secondary follicles of a cashmere goat reveals 51 genes that are differentially expressed between the two types of hair follicles. This study, whose results will facilitate goat genomics, shows that whole-genome mapping technology can be used for the de novo assembly of large genomes.
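The abstract above compares assemblies by the N50 metric. As a rough illustration only, the sketch below uses the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length). The function name `n50` and the toy length lists are hypothetical and are not taken from the paper; only the two N50 values in the final ratio come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (common definition)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of the total assembled length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # first length that crosses the halfway point
            return length


# Hypothetical toy data: the same 20 units of sequence, before and after joining.
toy_scaffolds = [2, 3, 4, 5, 6]
toy_super_scaffolds = [9, 11]
print(n50(toy_scaffolds), n50(toy_super_scaffolds))

# Ratio of the two N50 values reported in the abstract (both in Mb).
reported_ratio = 16.3 / 3.06
print(f"super-scaffold N50 / scaffold N50 = {reported_ratio:.2f}")
```

Joining sequences leaves the total length unchanged but raises N50, which is why the metric is used to compare contiguity between assemblies of the same genome.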

Journal Article

A genome-wide study of ruminants uncovers two endogenous retrovirus families recently active in goats

by Faraut, Thomas , Turpin, Jocelyn , Leroux, Caroline in Animal biology , Animal Genetics and Genomics , Biomedical and Life Sciences

2025

Background Endogenous retroviruses (ERVs) are traces of ancestral retroviral germline infections that constitute a significant portion of mammalian genomes and are classified as LTR-retrotransposons. The exploration of their dynamics and evolutionary history in ruminants remains limited, highlighting the need for a comprehensive and thorough investigation of the ERV landscape in the genomes of cattle, sheep and goat. Results Through a de novo bioinformatic analysis, we characterized 24 Class I and II ERV families across four reference assemblies of domestic and wild sheep and goats, and one assembly of cattle. Among these families, 13 are represented by consensus sequences identified in the five analyzed species, while eight are exclusive to small ruminants and three to cattle. The similarity-based approach used to search for the presence of these families in other ruminant species revealed multiple endogenization events over the last 40 million years and distinct evolutionary dynamics among species. The ERV annotation resulted in a high-resolution dataset of 100,534 ERV insertions across the five genomes, representing between 0.5 and 1% of their genomes. Solo-LTRs account for 83.2% of the annotated insertions, demonstrating that most of the ERVs are relics of past events. Two Class II families showed higher abundance and copy conservation in small ruminants. One of them is closely related to circulating exogenous retroviruses and is represented by 22 copies sharing identical LTRs and 12 with complete coding capacities in the domestic goat. Conclusions Our results suggest the presence of two ERV families with recent transpositional activity in ruminant genomes, particularly in the domestic goat, illustrating distinct evolutionary dynamics among the analyzed species. This work highlights the ongoing influence of ERVs on genomic landscapes and calls for further investigation of their evolutionary trajectories in these genomes.

Journal Article

Decoupling transcriptome layers: the distinct and variable nature of circular RNAs

by Faraut, Thomas , Robic, Annie , Liaubet, Laurence in Analysis , Animals , Anopheles

2025

Background Circular RNAs (circRNAs) and mRNAs are distinct transcripts from the same genes, produced by different splicing mechanisms. This study investigates the behavior of the circular transcriptome relative to the linear one across biological conditions and tissues. We analyzed transcriptomic data from 36 bovine monocyte-derived macrophage (MDM) samples collected during an ex vivo Mycobacterium avium ssp. paratuberculosis (MAP) infection experiment, stratified by Johne’s disease (JD) antibody status (JD+ or JD−) and by infection condition (control or MAP infected). We extended our analysis to healthy bovine tissues, including neonatal and post-pubertal testes, and liver and muscle samples from 12 animals stratified by sex and feed efficiency. Results In the 36 MDM samples, we identified 3358 exonic circRNAs derived from 1895 genes. By comparing the mean expression levels of circRNAs and linear transcripts, and considering the number of expressed genes, we estimate that the circular transcriptome is approximately 100 times smaller than the linear transcriptome. Analyses of the circular and linear transcriptomes revealed that MAP infection impacted only the linear transcriptome of MDM_JD−. The other three transcriptomes (circular JD−, circular JD+, and linear JD+) showed no infection-specific response. In the testes, maturation was associated with profound but uncoordinated changes in the circular and linear transcriptomes. While circRNA abundance declined, the linear transcriptome underwent a complete reorganization marked by the activation of novel genes. In the liver, female samples clustered by feed efficiency only when the entire linear and top-expressed circular transcriptomes were considered, respectively. In MDMs, the circular transcriptomes of control and infected samples, as well as the JD+ linear transcriptome, were dominated by donor-specific signatures. In contrast, the JD− linear transcriptome reflected MAP infection, with infection-specific structuring overriding inter-individual variation. Conclusions In both MDM and tissue samples, circular and linear transcriptomes follow distinct and largely independent regulatory logics. While both capture inter-individual variation, circRNA expression appears more variable and may carry fewer physiological signals, especially when no clear phenotypic signature has been detected in the corresponding linear transcriptome. These findings demonstrate that circular and linear RNAs arise from complementary and nonredundant layers of gene regulation, emphasizing the importance of analyzing both in parallel.

Journal Article

Comprehensive detection of structural variations in long and short reads dataset of French cattle

by Marcuzzo, Camille , Suin, Amandine , Faraut, Thomas in 631/114 , 631/1647 , 631/208

2025

Structural variants (SVs) correspond to different types of genomic variants larger than 50 bp. Many findings suggest the use of long-read (LR) rather than short-read (SR) sequencing to improve the accuracy of SV detection. Here, we present the results of an in-depth analysis for detection of SVs, mainly large insertions and deletions, in 14 French bovine breeds, based on whole-genome sequence (WGS) data comprising 176 LR and 571 SR samples, with 154 individuals having both LR and SR data available. We first investigated possible biases in the performance of well-known SV detection tools, namely CUTESV, PBSV, and SNIFFLES, using LR from different technologies, including PacBio HiFi, Oxford ONT, and PacBio CLR. We subsequently highlighted the abilities of tools for detecting SVs (DELLY, LUMPY, and MANTA) and for genotyping known SVs (GRAPHTYPER, SVTYPER, PARAGRAPH, and VG toolkit) using SR data. We then show how the incremental composition of samples in the reference panel affected SV genotyping for six validation individuals sequenced in SR. We then searched for the optimal parameters and created the final SV reference panel consisting of 25,191 deletions and 30,118 insertions. Finally, we emphasized the landscape of the genotyped SVs segregating across 571 SR individuals of 14 breeds.

Journal Article

VarGoats project: a dataset of 1159 whole-genome sequences to dissect Capra hircus global diversity

by Talouarn, Estelle , Faraut, Thomas , Bardou, Philippe in Africa , Agriculture , Animal biology

2021

Background Since their domestication 10,500 years ago, goat populations with distinctive genetic backgrounds have adapted to a broad variety of environments and breeding conditions. The VarGoats project is an international 1000-genome resequencing program designed to understand the consequences of domestication and breeding on the genetic diversity of domestic goats and to elucidate how speciation and hybridization have modeled the genomes of a set of species representative of the genus Capra. Findings A dataset comprising 652 sequenced goats and 507 public goat sequences, including 35 animals representing eight wild species, has been collected worldwide. We identified 74,274,427 single nucleotide polymorphisms (SNPs) and 13,607,850 insertion-deletions (InDels) by aligning these sequences to the latest version of the goat reference genome (ARS1). A Neighbor-joining tree based on Reynolds genetic distances showed that goats from Africa, Asia and Europe tend to group into independent clusters. Because goat breeds from Oceania and the Caribbean (Creole) all derive from imported animals, they are distributed along the tree according to their ancestral geographic origin. Conclusions We report on an unprecedented international effort to characterize the genome-wide diversity of domestic goats. This large range of sequenced individuals represents a unique opportunity to ascertain how the demographic and selection processes associated with post-domestication history have shaped the diversity of this species. Data generated for the project will also be extremely useful to identify deleterious mutations and polymorphisms with causal effects on complex traits, and thus will contribute to new knowledge that could be used in genomic prediction and genome-wide association studies.

Journal Article

Reconstruction of monocotelydoneous proto-chromosomes reveals faster evolution in plants than in animals

by Close, Timothy J , Faraut, Thomas , Waugh, Robbie in Agricultural sciences , Angiosperms , Animals

2009

Paleogenomics seeks to reconstruct ancestral genomes from the genes of today's species. The characterization of paleo-duplications represented by 11,737 orthologs and 4,382 paralogs identified in five species belonging to three of the agronomically most important subfamilies of grasses, that is, Ehrhartoideae (rice), Panicoideae (sorghum, maize), and Pooideae (wheat, barley), permitted us to propose a model for an ancestral genome with a minimal size of 33.6 Mb structured in five proto-chromosomes containing at least 9,138 predicted proto-genes. It appears that only four major evolutionary shuffling events (α, β, γ, and δ) explain the divergence of these five cereal genomes during their evolution from a common paleo-ancestor. Comparative analysis of ancestral gene function with rice as a reference indicated that five categories of genes were preferentially modified during evolution. Furthermore, alignments between the five grass proto-chromosomes and the recently identified seven eudicot proto-chromosomes indicated that additional very active episodes of genome rearrangements and gene mobility occurred during angiosperm evolution. If one compares the pace of primate evolution of 90 million years (233 species) to 60 million years of the Poaceae (10,000 species), change in chromosome structure through speciation has accelerated significantly in plants.

Journal Article

A Bos taurus sequencing methods benchmark for assembly, haplotyping, and variant calling

by Faraut, Thomas , Di Franco, Arnaud , Zytnicki, Matthias in 631/114/2785/2302 , 631/1647/514/2254 , 631/208/1348

2023

Inspired by the production of reference data sets in the Genome in a Bottle project, we sequenced one Charolais heifer with different technologies: Illumina paired-end, Oxford Nanopore, Pacific Biosciences (HiFi and CLR), 10X Genomics linked-reads, and Hi-C. In order to generate haplotypic assemblies, we also sequenced both parents with short reads. From these data, we built two haplotyped trio high-quality reference genomes and a consensus assembly, using up-to-date software packages. The assemblies obtained using PacBio HiFi reach a size of 3.2 Gb, which is significantly larger than the 2.7 Gb ARS-UCD1.2 reference. The BUSCO score of the consensus assembly reaches a completeness of 95.8% among highly conserved mammal genes. We also identified 35,866 structural variants larger than 50 base pairs. This assembly is a contribution to the bovine pangenome for the “Charolais” breed. These datasets will prove to be useful resources enabling the community to gain additional insight on sequencing technologies for applications such as SNP, indel or structural variant calling, and de novo assembly.

Journal Article

Detecting long tandem duplications in genomic sequences

by Audemard, Eric , Faraut, Thomas , Schiex, Thomas in Algorithms , Analysis , Arabidopsis - genetics

2012

Background Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. Results In this paper, we introduce ReD Tandem, software that uses a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non coding elements such as pseudo-genes or RNA genes. Conclusions ReD Tandem identifies large tandem duplications without any annotation, leading to an agnostic identification of tandem duplications. This approach nicely complements the usual protein-gene-based approach, which ignores duplications involving non coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also help improve existing annotations.

Journal Article

A duck RH panel and its potential for assisting NGS genome assembly

by Faraut, Thomas , Li, Ning , Bardes, Suzanne in Animal behavior , Animal Genetics and Genomics , Animal species

2012

Background Owing to the low cost of the high throughput Next Generation Sequencing (NGS) technology, more and more species have been and will be sequenced. However, de novo assemblies of large eukaryotic genomes thus produced are composed of a large number of contigs and scaffolds of medium to small size, having no chromosomal assignment. Radiation hybrid (RH) mapping is a powerful tool for building whole genome maps and has been used for several animal species, to help assign sequence scaffolds to chromosomes and determine their order. Results We report here a duck whole genome RH panel obtained by fusing female duck embryonic fibroblasts irradiated at a dose of 6,000 rads, with HPRT-deficient Wg3hCl2 hamster cells. The ninety best hybrids, having an average retention of 23.6% of the duck genome, were selected for the final panel. To allow the genotyping of large numbers of markers, as required for whole genome mapping, without having to cultivate the hybrid clones on a large scale, three different methods involving Whole Genome Amplification (WGA) and/or scaling down PCR volumes by using the Fluidigm BioMark™ Integrated Fluidic Circuits (IFC) Dynamic Array™ for genotyping were tested. RH maps of APL12 and APL22 were built, allowing the detection of intrachromosomal rearrangements when compared to chicken. Finally, the panel proved useful for checking the assembly of sequence scaffolds and for mapping EST located on one of the smallest microchromosomes. Conclusion The Fluidigm BioMark™ Integrated Fluidic Circuits (IFC) Dynamic Array™ genotyping by quantitative PCR provides a rapid and cost-effective method for building RH linkage groups. Although the vast majority of genotyped markers exhibited a picture coherent with their associated scaffolds, a few of them were discordant, pinpointing potential assembly errors. Comparative mapping with chicken chromosomes GGA21 and GGA11 allowed the detection of the first chromosome rearrangements on microchromosomes between duck and chicken. As in chicken, the smallest duck microchromosomes appear missing in the assembly and more EST data will be needed for mapping them. Altogether, this underlines the added value of RH mapping to improve genome assemblies.

Journal Article

Genetic and Haplotypic Structure in 14 European and African Cattle Breeds

by Gautier, Mathieu , Faraut, Thomas , Boichard, Didier in Africa , Animals , Artificial insemination

2007

To evaluate and compare the extent of LD in cattle, 1536 SNPs, mostly localized on BTA03, were detected in silico from available sequence data using two different methods and genotyped on samples from 14 distinct breeds originating from Europe and Africa. Only 696 SNPs could be validated, confirming the importance of trace-quality information for the in silico detection. Most of the validated SNPs were informative in several breeds and were used for a detailed description of their genetic structure and relationships. Results obtained were in agreement with previous studies performed on microsatellite markers and using larger samples. In addition, the majority of the validated SNPs could be mapped precisely, reaching an average density of one marker every 311 kb. This allowed us to analyze the extent of LD in the different breeds. Decrease of LD with physical distance across breeds revealed footprints of ancestral LD at short distances (<10 kb). As suggested by the haplotype block structure, these ancestral blocks are organized, within a breed, into larger blocks of a few hundred kilobases. In practice, such a structure similar to that already reported in dogs makes it possible to develop a chip of <300,000 SNPs, which should be efficient for mapping purposes in most cattle breeds.

Journal Article
