131 result(s) for "Calus, Mario"
Estimation of inbreeding using pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds
Background: Levels of inbreeding in cattle populations have increased in the past due to the use of a limited number of bulls for artificial insemination. High levels of inbreeding lead to reduced genetic diversity and inbreeding depression. Various estimators based on different sources, e.g., pedigree or genomic data, have been used to estimate inbreeding coefficients in cattle populations. However, the comparative advantage of using full sequence data to assess inbreeding is unknown. We used pedigree and genomic data at different densities from 50k to full sequence variants to compare how different methods performed for the estimation of inbreeding levels in three different cattle breeds. Results: Five different estimates for inbreeding were calculated and compared in this study: pedigree-based inbreeding coefficient (FPED); run of homozygosity (ROH)-based inbreeding coefficients (FROH); genomic relationship matrix (GRM)-based inbreeding coefficients (FGRM); inbreeding coefficients based on excess of homozygosity (FHOM) and correlation of uniting gametes (FUNI). Estimates using ROH provided direct estimates of the level of autozygosity in the current populations and are free from the effects of allele frequencies and incomplete pedigrees, which may increase inaccuracy in the estimation of inbreeding. The highest correlations were observed between FROH estimated from the full sequence variants and the FROH estimated from 50k SNP (single nucleotide polymorphism) genotypes. The estimator based on the correlation between uniting gametes (FUNI) using full genome sequences was also strongly correlated with FROH detected from sequence data. Conclusions: Estimates based on ROH directly reflected levels of homozygosity and were not influenced by allele frequencies, unlike the three other estimates evaluated (FGRM, FHOM and FUNI), which depended on estimated allele frequencies. FPED suffered from limited pedigree depth. Marker density affects ROH estimation. Detecting ROH based on 50k chip data was observed to give estimates similar to ROH from sequence data. In the absence of full sequence data, ROH based on 50k can be used to assess homozygosity levels in individuals. However, genotypes denser than 50k are required to accurately detect short ROH that are most likely identical by descent (IBD).
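The abstract does not spell out how FROH is computed. A common definition, assumed here rather than taken from the paper, is the total length of an individual's ROH segments divided by the length of the genome covered by markers. A minimal Python sketch under that assumption follows; the function name, the `min_length_bp` filter and the segment coordinates are all hypothetical.

```python
def f_roh(roh_segments, genome_length_bp, min_length_bp=0):
    """Fraction of the genome covered by runs of homozygosity (ROH).

    roh_segments: iterable of (start_bp, end_bp) pairs for one individual.
    genome_length_bp: length of the genome covered by markers (assumed known).
    min_length_bp: hypothetical filter that ignores segments shorter than this.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    total_roh_bp = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return total_roh_bp / genome_length_bp


# Made-up segments: one 1 Mb run and one 2 Mb run on a 100 Mb genome.
segments = [(0, 1_000_000), (5_000_000, 7_000_000)]
print(f_roh(segments, 100_000_000))                           # 0.03
print(f_roh(segments, 100_000_000, min_length_bp=1_500_000))  # 0.02
```

Raising `min_length_bp` mimics restricting the estimate to longer ROH, which matters because the abstract notes that short ROH are hard to detect at 50k density.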
Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding
Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade.
Breeding Top Genotypes and Accelerating Response to Recurrent Selection by Selecting Parents with Greater Gametic Variance
Because of variation in linkage phase and heterozygosity among individuals, some individuals produce genetically more variable gametes than others. With the availability of genomic EBVs (GEBVs) or estimates of SNP-effects together with phased genotypes, differences in gametic variability can be quantified by simulating a set of virtual gametes of each selection candidate. Previous results in dairy cattle show that gametic variance can be large. Here, we show that breeders can increase the probability of breeding a top-ranking genotype and response to recurrent selection by selecting parents that produce more variable gametes, using the index I = GEBV + 2 × xp × SDgGEBV, where xp is the standardized normal truncation point belonging to selected proportion p, and SDgGEBV is the SD of the GEBV of an individual’s gametes. Benefits of the index were considerably larger in an ongoing selection program with equilibrium genetic parameters than in an initially unselected population. Superiority of the index over selection on GEBV increased strongly with the magnitude of the SDgGEBV, indicating that benefits of the index may vary considerably among populations. Compared to selection on ordinary GEBV, the probability of breeding a top-ranking individual can be increased by ∼36%, and response to selection by ∼3.6% when selection is strong (P = 0.001) based on values for the Holstein-Friesian dairy cattle population. Two-stage selection, with a preselection on GEBV and a final selection on the index, considerably reduced computational requirements with little loss of benefits. Response to multiple generations of selection and inheritance of the SDgGEBV require further study.
Genetic parameters, reciprocal cross differences, and age-related heterosis of egg-laying performance in chickens
Background: Egg-laying performance is economically important in poultry breeding programs. Crossbreeding between indigenous and elite commercial lines to exploit heterosis has been an upward trend in traditional layer breeding for niche markets. The objective of this study was to analyse the genetic background and to estimate the heterosis of longitudinal egg-laying traits in reciprocal crosses between an indigenous Beijing-You and an elite commercial White Leghorn layer line. Egg weights were measured for the first three eggs, monthly from 28 to 76 weeks of age, and at 86 and 100 weeks of age. Egg quality traits were measured at 32, 54, 72, 86, and 100 weeks of age. Egg production traits were measured from the start of lay until 43, 72, and 100 weeks of age. Heritabilities and phenotypic and genetic correlations were estimated. Heterosis was estimated as the percentage difference of performance of a crossbred from that of the parental average. Reciprocal cross differences were estimated as the difference between the reciprocal crossbreds as a percentage of the parental average. Results: Estimates of heritability of egg weights ranged from 0.29 to 0.75. Estimates of genetic correlations between egg weights at different ages ranged from 0.72 to 1.00. Estimates of heritability for cumulative egg numbers until 43, 72, and 100 weeks of age were around 0.15. Estimates of heterosis for egg weight and cumulative egg number increased with age, ranging from 1.0 to 9.0% and from 1.4 to 11.6%, respectively. From 72 to 100 weeks of age, crossbreds produced more eggs per week than the superior parent White Leghorn (3.5 eggs for White Leghorn, 3.8 and 3.9 eggs for crossbreds). Heterosis for eggshell thickness ranged from 2.7 to 6.6% when using Beijing-You as the sire breed. No significant difference between reciprocal crosses was observed for the investigated traits, except for eggshell strength at 54 weeks of age. Conclusions: The heterosis was substantial for egg weight and cumulative egg number, and increased with age, suggesting that non-additive genetic effects are important in crossbreds between the indigenous and elite breeds. Generally, the crossbreds performed similarly to or even outperformed the commercial White Leghorns for egg production persistency.
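The two definitions given in this abstract (heterosis as the percentage difference of a crossbred from the parental average, and the reciprocal cross difference as a percentage of the parental average) can be written out as a short Python sketch. The function names and the trait means below are hypothetical and are not values from the study.

```python
def heterosis_percent(crossbred_mean, parent_a_mean, parent_b_mean):
    """Percentage difference of a crossbred mean from the parental average."""
    parental_average = (parent_a_mean + parent_b_mean) / 2.0
    return 100.0 * (crossbred_mean - parental_average) / parental_average


def reciprocal_difference_percent(cross_ab_mean, cross_ba_mean,
                                  parent_a_mean, parent_b_mean):
    """Difference between the reciprocal crosses as a % of the parental average."""
    parental_average = (parent_a_mean + parent_b_mean) / 2.0
    return 100.0 * (cross_ab_mean - cross_ba_mean) / parental_average


# Made-up trait means: pure lines at 60 and 40, reciprocal crosses at 52 and 51.
print(heterosis_percent(52.0, 60.0, 40.0))                    # 4.0
print(reciprocal_difference_percent(52.0, 51.0, 60.0, 40.0))  # 2.0
```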
Accuracy of multi-trait genomic selection using different methods
Background: Genomic selection has become a very important tool in animal genetics and is rapidly emerging in plant genetics. It holds the promise to be particularly beneficial to select for traits that are difficult or expensive to measure, such as traits that are measured in one environment and selected for in another environment. The objective of this paper was to develop three models that would permit multi-trait genomic selection by combining scarcely recorded traits with genetically correlated indicator traits, and to compare their performance to single-trait models, using simulated datasets. Methods: Three single nucleotide polymorphism (SNP)-based models were used. Model G and BCπ0 assumed that contributed (co)variances of all SNP are equal. Model BSSVS sampled SNP effects from a distribution with large (or small) effects to model SNP that are (or are not) associated with a quantitative trait locus. For reasons of comparison, model A including pedigree but not SNP information was fitted as well. Results: In terms of accuracies for animals without phenotypes, the models generally ranked as follows: BSSVS > BCπ0 > G >> A. Using multi-trait SNP-based models, the accuracy for juvenile animals without any phenotypes increased up to 0.10. For animals with phenotypes on an indicator trait only, accuracy increased up to 0.03 and 0.14, for genetic correlations with the evaluated trait of 0.25 and 0.75, respectively. Conclusions: When the indicator trait had a genetic correlation lower than 0.5 with the trait of interest in our simulated data, the accuracy was higher if genotypes rather than phenotypes were obtained for the indicator trait. However, when genetic correlations were higher than 0.5, using an indicator trait led to higher accuracies for selection candidates. For different combinations of traits, the level of genetic correlation below which genotyping selection candidates is more effective than obtaining phenotypes for an indicator trait needs to be derived considering at least the heritabilities and the numbers of animals recorded for the traits involved.
Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein–Friesian cattle
Background: Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. Methods: Phenotypes were available for 5503 Holstein–Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. Results: The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. Conclusions: Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.
Genomic Prediction in Animals and Plants: Simulation of Data, Validation, Reporting, and Benchmarking
The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. 
In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits with fewer QTL variable selection did have some advantages. In the real data sets examined here all methods had very similar accuracies. We conclude that no single method can serve as a benchmark for genomic prediction. We recommend comparing accuracy and bias of new methods to results from genomic best linear prediction and a variable selection approach (e.g., BayesB), because, together, these methods are appropriate for a range of genetic architectures. An accompanying article in this issue provides a comprehensive review of genomic prediction methods and discusses a selection of topics related to application of genomic prediction in plants and animals.
SNPrune: an efficient algorithm to prune large SNP array and sequence datasets based on high linkage disequilibrium
Background: High levels of pairwise linkage disequilibrium (LD) in single nucleotide polymorphism (SNP) array or whole-genome sequence data may affect both performance and efficiency of genomic prediction models. Thus, this warrants pruning of genotyping data for high LD. We developed an algorithm, named SNPrune, which enables the rapid detection of any pair of SNPs in complete or high LD throughout the genome. Methods: LD, measured as the squared correlation between phased alleles (r²), can only reach a value of 1 when both loci have the same count of the minor allele. Sorting loci based on the minor allele count, followed by comparison of their alleles, enables rapid detection of loci in complete LD. Detection of loci in high LD can be optimized by computing the range of the minor allele count at another locus for each possible value of the minor allele count that can yield LD values higher than a predefined threshold. This efficiently reduces the number of pairs of loci for which LD needs to be computed, instead of considering all pairwise combinations of loci. The implemented algorithm SNPrune considered bi-allelic loci either using phased alleles or allele counts as input. SNPrune was validated against PLINK on two datasets, using an r² threshold of 0.99. The first dataset contained 52k SNP genotypes on 3534 pigs and the second dataset contained simulated whole-genome sequence data with 10.8 million SNPs and 2500 animals. Results: SNPrune removed a similar number of SNPs as PLINK from the pig data but SNPrune was almost 12 times faster than PLINK. From the simulated sequence data with 10.8 million SNPs, SNPrune removed 6.4 and 1.4 million SNPs due to complete and high LD. Results were very similar regardless of whether phased alleles or allele counts were used. Using allele counts and multi-threading with 10 threads, SNPrune completed the analysis in 21 min. Using a sliding window of up to 500,000 SNPs, PLINK removed ~43,000 fewer SNPs (0.6%) in the sequence data and SNPrune was 24 to 170 times faster, using one or ten threads, respectively. Conclusions: The SNPrune algorithm developed here is able to remove SNPs in high LD throughout the genome very efficiently in large datasets.
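The core idea for complete LD described in this abstract, that two loci can only be in complete LD if they share the same minor allele count so that only loci with equal counts need to be compared, can be illustrated with a small Python sketch. This is a simplified illustration and not the SNPrune implementation: it handles phased 0/1 alleles only, covers complete LD but not the high-LD threshold case, and the function name and example data are hypothetical.

```python
from collections import defaultdict

import numpy as np


def complete_ld_groups(haplotypes):
    """Group loci whose phased alleles are identical or exact complements.

    haplotypes: (n_haplotypes, n_loci) array of 0/1 phased alleles.
    Returns a list of lists of locus indices that are in complete LD.
    """
    hap = np.asarray(haplotypes, dtype=np.uint8)
    n_hap, n_loci = hap.shape
    ones = hap.sum(axis=0)
    minor_allele_count = np.minimum(ones, n_hap - ones)

    # Bucket loci by minor allele count; complete LD is impossible across buckets.
    buckets = defaultdict(list)
    for locus in range(n_loci):
        if minor_allele_count[locus] == 0:
            continue  # monomorphic locus: the correlation is undefined
        buckets[int(minor_allele_count[locus])].append(locus)

    groups = []
    for count in sorted(buckets):
        seen = {}
        for locus in buckets[count]:
            column = hap[:, locus]
            # Flip so the first haplotype always carries 0; identical and
            # complementary loci then share the same canonical pattern.
            if column[0] == 1:
                column = 1 - column
            seen.setdefault(column.tobytes(), []).append(locus)
        groups.extend(group for group in seen.values() if len(group) > 1)
    return groups


# Made-up data: 4 haplotypes (rows) by 4 loci (columns).
example = [
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
]
print(complete_ld_groups(example))  # [[0, 1]]
```

In the example, loci 0 and 1 are exact complements and are therefore grouped, locus 2 has the same minor allele count but a different pattern, and locus 3 sits alone in its own bucket, so it is never compared with the others.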
International single-step SNPBLUP beef cattle evaluations for Limousin weaning weight
Background: Compared to national evaluations, international collaboration projects further improve accuracies of estimated breeding values (EBV) by building larger reference populations or performing a joint evaluation using data (or proxy of them) from different countries. Genomic selection is increasingly adopted in beef cattle, but, to date, the benefits of including genomic information in international evaluations have not been explored. Our objective was to develop an international beef cattle single-step genomic evaluation and investigate its impact on the accuracy and bias of genomic evaluations compared to current pedigree-based evaluations. Methods: Weaning weight records were available for 331,593 animals from seven European countries. The pedigree included 519,740 animals. After imputation and quality control, 17,607 genotypes at a density of 57,899 single nucleotide polymorphisms (SNPs) from four countries were available. We implemented two international scenarios where countries were modelled as different correlated traits: an international genomic single-step SNP best linear unbiased prediction (SNPBLUP) evaluation (ssSNPBLUP_INT) and an international pedigree-based BLUP evaluation (PBLUP_INT). Two national scenarios were implemented for pedigree and genomic evaluations using only nationally submitted phenotypes and genotypes. Accuracies, level and dispersion bias of EBV of animals born from 2014 onwards, and increases in population accuracies were estimated using the linear regression method. Results: On average across countries, 39 and 17% of sires and maternal-grand-sires with recorded (grand-)offspring across two countries were genotyped. ssSNPBLUP_INT showed the highest accuracies of EBV and, compared to PBLUP_INT, led to increases in population accuracy of 13.7% for direct EBV, and 25.8% for maternal EBV, on average across countries. Increases in population accuracies when moving from national scenarios to ssSNPBLUP_INT were observed for all countries. Overall, ssSNPBLUP_INT level and dispersion bias remained similar or slightly reduced compared to PBLUP_INT and national scenarios. Conclusions: International single-step SNPBLUP evaluations are feasible and lead to higher population accuracies for both large and small countries compared to current international pedigree-based evaluations and national evaluations. These results are likely related to the larger multi-country reference population and the inclusion of phenotypes from relatives recorded in other countries via single-step international evaluations. The proposed international single-step approach can be applied to other traits and breeds.
Computational strategies for the preconditioned conjugate gradient method applied to ssSNPBLUP, with an application to a multivariate maternal model
Background: The single-step single nucleotide polymorphism best linear unbiased prediction (ssSNPBLUP) is one of the single-step evaluations that enable a simultaneous analysis of phenotypic and pedigree information of genotyped and non-genotyped animals with a large number of genotypes. The aim of this study was to develop and illustrate several computational strategies to efficiently solve different ssSNPBLUP systems with a large number of genotypes on current computers. Results: The different developed strategies were based on simplified computations of some terms of the preconditioner, and on splitting the coefficient matrix of the different ssSNPBLUP systems into multiple parts to perform its multiplication by a vector more efficiently. Some matrices were computed explicitly and stored in memory (e.g. the inverse of the pedigree relationship matrix), or were stored using a compressed form (e.g. the Plink 1 binary form for the genotype matrix), to permit the use of efficient parallel procedures while limiting the required amount of memory. The developed strategies were tested on a bivariate genetic evaluation for livability of calves for the Netherlands and the Flemish region in Belgium. There were 29,885,286 animals in the pedigree, 25,184,654 calf records, and 131,189 genotyped animals. The ssSNPBLUP system required around 18 GB of random access memory and 12 h to be solved with the best-performing implementation. Conclusions: Based on our proposed approaches and results, we showed that ssSNPBLUP provides a feasible approach in terms of memory and time requirements to estimate genomic breeding values using current computers.
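As a generic illustration of the method named in this title, the following Python sketch implements a textbook preconditioned conjugate gradient solver with a diagonal (Jacobi) preconditioner. The coefficient matrix is never formed; it is applied as the sum of two parts, loosely mirroring the idea of splitting the matrix for the matrix-vector product. It is not the authors' implementation: the diagonal preconditioner, the function name and the 2×2 example system are assumptions made for illustration.

```python
import numpy as np


def pcg(matvec, rhs, precond_diag, tol=1e-10, max_iter=1000):
    """Solve A x = rhs for symmetric positive definite A.

    matvec: function returning A @ v for a vector v (A is never stored).
    precond_diag: diagonal of the (assumed) Jacobi preconditioner.
    """
    x = np.zeros_like(rhs, dtype=float)
    residual = rhs - matvec(x)
    z = residual / precond_diag          # apply the inverse preconditioner
    direction = z.copy()
    rz = residual @ z
    rhs_norm = np.linalg.norm(rhs)
    for _ in range(max_iter):
        if np.linalg.norm(residual) <= tol * rhs_norm:
            break
        a_dir = matvec(direction)
        step = rz / (direction @ a_dir)
        x += step * direction
        residual -= step * a_dir
        z = residual / precond_diag
        rz_new = residual @ z
        direction = z + (rz_new / rz) * direction
        rz = rz_new
    return x


# Made-up system: A = part_a + part_b, applied part by part.
part_a = np.array([[4.0, 1.0], [1.0, 3.0]])
part_b = np.array([[1.0, 0.0], [0.0, 1.0]])
rhs = np.array([1.0, 2.0])

solution = pcg(
    matvec=lambda v: part_a @ v + part_b @ v,
    rhs=rhs,
    precond_diag=np.diag(part_a) + np.diag(part_b),
)
print(solution)
```

Passing `matvec` as a function rather than a matrix is the design choice that lets each part be stored in whatever form suits it, for example a sparse or compressed matrix.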