Catalogue Search | MBRL

Developing a 670k genotyping array to tag ~2M SNPs across 24 horse breeds

by Metzger, Julia , McCue, Molly E. , Bannasch, Danika L. in Agricultural and Veterinary Sciences , Animal and Dairy Science , Animal Genetics and Genomics

2017

Background To date, genome-scale analyses in the domestic horse have been limited by suboptimal single nucleotide polymorphism (SNP) density and uneven genomic coverage of the current SNP genotyping arrays. The recent availability of whole genome sequences has created the opportunity to develop a next generation, high-density equine SNP array. Results Using whole genome sequence from 153 individuals representing 24 distinct breeds collated by the equine genomics community, we cataloged over 23 million de novo discovered genetic variants. Leveraging genotype data from individuals with both whole genome sequence, and genotypes from lower-density, legacy SNP arrays, a subset of ~5 million high-quality, high-density array candidate SNPs were selected based on breed representation and uniform spacing across the genome. Considering probe design recommendations from a commercial vendor (Affymetrix, now Thermo Fisher Scientific) a set of ~2 million SNPs were selected for a next-generation high-density SNP chip (MNEc2M). Genotype data were generated using the MNEc2M array from a cohort of 332 horses from 20 breeds and a lower-density array, consisting of ~670 thousand SNPs (MNEc670k), was designed for genotype imputation. Conclusions Here, we document the steps taken to design both the MNEc2M and MNEc670k arrays, report genomic and technical properties of these genotyping platforms, and demonstrate the imputation capabilities of these tools for the domestic horse.

Journal Article

Share this book

Add to My Shelf

Cross-Comparison of Exome Analysis, Next-Generation Sequencing of Amplicons, and the iPLEX® ADME PGx Panel for Pharmacogenomic Profiling

by Ton, Kim N. T. , Shepherd, Phillip , Chua, Eng Wee in CYP2D6 protein , Cytochrome P450 , Drug therapy

2016

Whole-exome sequencing (WES) has been widely used for analysis of human genetic diseases, but its value for the pharmacogenomic profiling of individuals is not well studied. Initially, we performed an in-depth evaluation of the accuracy of WES variant calling in the pharmacogenes CYP2D6 and CYP2C19 by comparison with MiSeq(®) amplicon sequencing data (n = 36). This analysis revealed that the concordance rate between WES and MiSeq(®) was high, achieving 99.60% for variants that were called without exceeding the truth-sensitivity threshold (99%), defined during variant quality score recalibration (VQSR). Beyond this threshold, the proportion of discordant calls increased markedly. Subsequently, we expanded our findings beyond CYP2D6 and CYP2C19 to include more genes genotyped by the iPLEX(®) ADME PGx Panel in the subset of twelve samples. WES performed well, agreeing with the genotyping panel in approximately 99% of the selected pass-filter variant calls. Overall, our results have demonstrated WES to be a promising approach for pharmacogenomic profiling, with an estimated error rate of lower than 1%. Quality filters, particularly VQSR, are important for reducing the number of false variants. Future studies may benefit from examining the role of WES in the clinical setting for guiding drug therapy.

Journal Article

Share this book

Add to My Shelf

Impact of post-alignment processing in variant discovery from whole exome data

by Kalmbach, Michael , Yan, Huihuang , Slager, Susan L. in Accuracy , Algorithms , Bioinformatics

2016

Background GATK Best Practices workflows are widely used in large-scale sequencing projects and recommend post-alignment processing before variant calling. Two key post-processing steps include the computationally intensive local realignment around known INDELs and base quality score recalibration (BQSR). Both have been shown to reduce erroneous calls; however, the findings are mainly supported by the analytical pipeline that incorporates BWA and GATK UnifiedGenotyper. It is not known whether there is any benefit of post-processing and to what extent the benefit might be for pipelines implementing other methods, especially given that both mappers and callers are typically updated. Moreover, because sequencing platforms are upgraded regularly and the new platforms provide better estimations of read quality scores, the need for post-processing is also unknown. Finally, some regions in the human genome show high sequence divergence from the reference genome; it is unclear whether there is benefit from post-processing in these regions. Results We used both simulated and NA12878 exome data to comprehensively assess the impact of post-processing for five or six popular mappers together with five callers. Focusing on chromosome 6p21.3, which is a region of high sequence divergence harboring the human leukocyte antigen (HLA) system, we found that local realignment had little or no impact on SNP calling, but increased sensitivity was observed in INDEL calling for the Stampy + GATK UnifiedGenotyper pipeline. No or only a modest effect of local realignment was detected on the three haplotype-based callers and no evidence of effect on Novoalign. BQSR had virtually negligible effect on INDEL calling and generally reduced sensitivity for SNP calling that depended on caller, coverage and level of divergence. Specifically, for SAMtools and FreeBayes calling in the regions with low divergence, BQSR reduced the SNP calling sensitivity but improved the precision when the coverage is insufficient. However, in regions of high divergence (e.g., the HLA region), BQSR reduced the sensitivity of both callers with little gain in precision rate. For the other three callers, BQSR reduced the sensitivity without increasing the precision rate regardless of coverage and divergence level. Conclusions We demonstrated that the gain from post-processing is not universal; rather, it depends on mapper and caller combination, and the benefit is influenced further by sequencing depth and divergence level. Our analysis highlights the importance of considering these key factors in deciding to apply the computationally intensive post-processing to Illumina exome data.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter