Catalogue Search | MBRL
41 result(s) for "Li, Jeremiah H."
Comparing low-pass sequencing and genotyping for trait mapping in pharmacogenetics
by Wasik, Kaja; Fraser, Dana J.; Li, Jeremiah H.
in Animal Genetics and Genomics, Antigens, Arrays
2021
Background
Low pass sequencing has been proposed as a cost-effective alternative to genotyping arrays to identify genetic variants that influence multifactorial traits in humans. For common diseases this typically has required both large sample sizes and comprehensive variant discovery. Genotyping arrays are also routinely used to perform pharmacogenetic (PGx) experiments where sample sizes are likely to be significantly smaller, but clinically relevant effect sizes likely to be larger.
Results
To assess how low pass sequencing would compare to array based genotyping for PGx we compared a low-pass assay (in which 1x coverage or less of a target genome is sequenced) along with software for genotype imputation to standard approaches. We sequenced 79 individuals to 1x genome coverage and genotyped the same samples on the Affymetrix Axiom Biobank Precision Medicine Research Array (PMRA). We then down-sampled the sequencing data to 0.8x, 0.6x, and 0.4x coverage, and performed imputation. Both the genotype data and the sequencing data were further used to impute human leukocyte antigen (HLA) genotypes for all samples. We compared the sequencing data and the genotyping array data in terms of four metrics: overall concordance, concordance at single nucleotide polymorphisms in pharmacogenetics-related genes, concordance in imputed HLA genotypes, and imputation r². Overall concordance between the two assays ranged from 98.2% (for 0.4x coverage sequencing) to 99.2% (for 1x coverage sequencing), with qualitatively similar numbers for the subsets of variants most important in pharmacogenetics. At common single nucleotide polymorphisms (SNPs), the mean imputation r² from the genotyping array was 0.90, which was comparable to the imputation r² from 0.4x coverage sequencing, while the mean imputation r² from 1x sequencing data was 0.96.
Conclusions
These results indicate that low-pass sequencing to a depth above 0.4x coverage attains higher power for association studies when compared to the PMRA and should be considered as a competitive alternative to genotyping arrays for trait mapping in pharmacogenetics.
Journal Article
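Two of the metrics named in the abstract above, genotype concordance and imputation r², can be illustrated with a minimal Python sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2), imputed genotypes are given as expected dosages, and imputation r² is taken as the squared Pearson correlation between dosages and reference genotypes. The function names and toy values are hypothetical and are not taken from the study's own pipeline.

```python
from statistics import mean


def concordance(calls_a, calls_b):
    """Fraction of sites where two assays report the same genotype.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing
    call, and sites missing in either assay are skipped (an assumption).
    """
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no sites called in both assays")
    return sum(a == b for a, b in pairs) / len(pairs)


def imputation_r2(dosages, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes."""
    mx, my = mean(dosages), mean(truth)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, truth))
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in truth)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either vector is constant
    return (sxy * sxy) / (sxx * syy)


# Toy data (hypothetical): five sites, one missing in the first assay.
array_calls = [0, 1, 2, 1, None]
seq_calls = [0, 1, 2, 2, 0]
print(concordance(array_calls, seq_calls))

# Toy dosages for one variant across four samples (hypothetical).
print(imputation_r2([0.1, 0.9, 1.8, 0.2], [0, 1, 2, 0]))
```

In practice r² is usually computed per variant across samples and then averaged within allele-frequency bins, which is why the abstract reports a mean r² at common SNPs.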
Genotype inference from aggregated chromatin accessibility data reveals genetic regulatory mechanisms
by Voight, Benjamin F.; Li, Jeremiah H.; Wenz, Brandon M.
in Accuracy, Animal Genetics and Genomics, Bioinformatics
2025
Background
Understanding the genetic causes underlying variability in chromatin accessibility can shed light on the molecular mechanisms through which genetic variants may affect complex traits. Thousands of ATAC-seq samples have been collected that hold information about chromatin accessibility across diverse cell types and contexts, but most of these are not paired with genetic information and come from distinct projects and laboratories.
Results
We report here joint genotyping, chromatin accessibility peak calling, and discovery of quantitative trait loci which influence chromatin accessibility (caQTLs), demonstrating the capability of performing caQTL analysis on a large scale in a diverse sample set without pre-existing genotype information. Using 10,293 profiling samples representing 1454 unique donor individuals across 653 studies from public databases, we catalog 24,159 caQTLs in total. After joint discovery analysis, we cluster samples based on accessible chromatin profiles to identify context-specific caQTLs. We find that caQTLs are strongly enriched for annotations of gene regulatory elements across diverse cell types and tissues and are often linked with genetic variation associated with changes in expression (eQTLs), indicating that caQTLs can mediate genetic effects on gene expression. We demonstrate sharing of causal variants for chromatin accessibility across human traits, enabling a more complete picture of the genetic mechanisms underlying complex human phenotypes.
Conclusions
Our work provides a proof of principle for caQTL calling from previously ungenotyped samples and represents one of the largest, most diverse caQTL resources currently available, informing mechanisms of genetic regulation of gene expression and contribution to disease.
Journal Article
Low-pass sequencing plus imputation using avidity sequencing displays comparable imputation accuracy to sequencing by synthesis while reducing duplicates
2024
Low-pass sequencing with genotype imputation has been adopted as a cost-effective method for genotyping. The most widely used method of short-read sequencing uses sequencing by synthesis (SBS). Here we perform a study of a novel sequencing technology—avidity sequencing. In this short note, we compare the performance of imputation from low-pass libraries sequenced on an Element AVITI system (which utilizes avidity sequencing) to those sequenced on an Illumina NovaSeq 6000 (which utilizes SBS) with an SP flow cell for the same set of biological samples across a range of genetic ancestries. We observed dramatically lower optical duplication rates in the data deriving from the AVITI system compared to the NovaSeq 6000, resulting in higher effective coverage given a fixed number of sequenced bases, and comparable imputation accuracy performance between sequencing chemistries across ancestries. This study demonstrates that avidity sequencing is a viable alternative to the standard SBS chemistries for applications involving low-pass sequencing plus imputation.
Journal Article
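The link the abstract above draws between duplication rate and effective coverage can be made concrete with a short Python sketch. It uses a deliberately simplified model, an assumption rather than the study's method: duplicate reads contribute no independent information, so effective coverage is the non-duplicate fraction of sequenced bases divided by genome length. The function name and all numbers below are hypothetical.

```python
def effective_coverage(sequenced_bases, genome_length, duplication_rate):
    """Approximate coverage after discarding duplicate reads.

    Simplified model (assumption): every duplicate is uninformative, and
    unmapped or filtered reads are ignored.
    """
    if not 0.0 <= duplication_rate <= 1.0:
        raise ValueError("duplication_rate must lie in [0, 1]")
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sequenced_bases * (1.0 - duplication_rate) / genome_length


# Hypothetical example: the same number of sequenced bases on two platforms
# with different duplication rates (values invented for illustration).
bases = 3.1e9          # roughly 1x of a ~3.1 Gb genome
genome = 3.1e9
low_dup = effective_coverage(bases, genome, 0.02)
high_dup = effective_coverage(bases, genome, 0.15)
print(f"low duplication:  {low_dup:.2f}x")
print(f"high duplication: {high_dup:.2f}x")
```

Under this model, a fixed sequencing budget yields more usable coverage on the platform with fewer duplicates, which is the effect the abstract describes.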
Assessment of Imputation from Low-Pass Sequencing to Predict Merit of Beef Steers
by Li, Jeremiah H.; Hoff, Jesse L.; Snelling, Warren M.
in Animal Husbandry - methods, Animals, beef
2020
Decreasing costs are making low coverage sequencing with imputation to a comprehensive reference panel an attractive alternative to obtain functional variant genotypes that can increase the accuracy of genomic prediction. To assess the potential of low-pass sequencing, genomic sequence of 77 steers sequenced to >10X coverage was downsampled to 1X and imputed to a reference of 946 cattle representing multiple Bos taurus and Bos indicus-influenced breeds. Genotypes for nearly 60 million variants detected in the reference were imputed from the downsampled sequence. The imputed genotypes strongly agreed with the SNP array genotypes (r̄ = 0.99) and the genotypes called from the transcript sequence (r̄ = 0.97). Effects of BovineSNP50 and GGP-F250 variants on birth weight, postweaning gain, and marbling were solved without the steers’ phenotypes and genotypes, then applied to their genotypes, to predict the molecular breeding values (MBV). The steers’ MBV were similar when using imputed and array genotypes. Replacing array variants with functional sequence variants might allow more robust MBV. Imputation from low coverage sequence offers a viable, low-cost approach to obtain functional variant genotypes that could improve genomic prediction.
Journal Article
Genotype inference from aggregated chromatin accessibility data reveals genetic regulatory mechanisms
2024
Understanding the genetic causes for variability in chromatin accessibility can shed light on the molecular mechanisms through which genetic variants may affect complex traits. Thousands of ATAC-seq samples have been collected that hold information about chromatin accessibility across diverse cell types and contexts, but most of these are not paired with genetic information and come from distinct projects and laboratories.
We report here joint genotyping, chromatin accessibility peak calling, and discovery of quantitative trait loci which influence chromatin accessibility (caQTLs), demonstrating the capability of performing caQTL analysis on a large scale in a diverse sample set without pre-existing genotype information. Using 10,293 profiling samples representing 1,454 unique donor individuals across 653 studies from public databases, we catalog 23,381 caQTLs in total. After joint discovery analysis, we cluster samples based on accessible chromatin profiles to identify context-specific caQTLs. We find that caQTLs are strongly enriched for annotations of gene regulatory elements across diverse cell types and tissues and are often strongly linked with genetic variation associated with changes in expression (eQTLs), indicating that caQTLs can mediate genetic effects on gene expression. We demonstrate sharing of causal variants for chromatin accessibility and diverse complex human traits, enabling a more complete picture of the genetic mechanisms underlying complex human phenotypes.
Our work provides a proof of principle for caQTL calling from previously ungenotyped samples, and represents one of the largest, most diverse caQTL resources currently available, informing mechanisms of genetic regulation of gene expression and contribution to disease.
Journal Article
A comparison between low-cost library preparation kits for low coverage sequencing
2024
In the fields of human health and agricultural research, low coverage whole-genome sequencing followed by imputation to a large haplotype reference panel has emerged as a cost-effective alternative to genotyping arrays for assaying large numbers of samples. However, a systematic comparison of library preparation methods tailored for low coverage sequencing remains absent in the existing literature. In this study, we evaluated one full-sized kit from IDT, and miniaturized and evaluated three Illumina-compatible library preparation kits: the KAPA HyperPlus kit (Roche), the DNA Prep kit (Illumina), and an IDT kit. All kits were tested on 96 human DNA samples. Metrics evaluated included imputation concordance with high-depth genotypes, coverage, duplication rates, time for library preparation, and additional optimization requirements. Despite slightly elevated duplication rates in IDT kits, we find that all four kits perform well in terms of imputation accuracy, with IDT kits being only marginally less performant than Illumina and Roche kits. Laboratory handling of the kits was similar; thus, the choice of a kit will largely depend on (1) existing or planned infrastructure, such as liquid handling capabilities, (2) whether a specific characteristic is desired, such as the use of full-length adapters or shorter processing times, and (3) the use case, for instance long- versus short-read sequencing. Our findings offer a comprehensive resource for both commercial and research workflows of low-cost library preparation methods suitable for high-throughput low coverage whole genome sequencing.
Competing Interest Statement: CMS, MJSG, JYP, and JHL were all employees of Gencove, Inc. at the time of writing.
Low-pass sequencing increases the power of GWAS and decreases measurement error of polygenic risk scores compared to genotyping arrays
by Berisa, Tomaz; Mazur, Chase A; Li, Jeremiah H
in Breast cancer, Cardiovascular disease, Coronary artery
2021
Low-pass sequencing (sequencing a genome to an average depth of less than 1× coverage) combined with genotype imputation has been proposed as an alternative to genotyping arrays for trait mapping and calculation of polygenic scores. To empirically assess the relative performance of these technologies for different applications, we performed low-pass sequencing (targeting coverage levels of 0.5× and 1×) and array genotyping (using the Illumina Global Screening Array (GSA)) on 120 DNA samples derived from African and European-ancestry individuals that are part of the 1000 Genomes Project. We then imputed both the sequencing data and the genotyping array data to the 1000 Genomes Phase 3 haplotype reference panel using a leave-one-out design. We evaluated overall imputation accuracy from these different assays as well as overall power for GWAS from imputed data, and computed polygenic risk scores for coronary artery disease and breast cancer using previously derived weights. We conclude that low-pass sequencing plus imputation, in addition to providing a substantial increase in statistical power for genome-wide association studies, provides increased accuracy for polygenic risk prediction at effective coverages of ~0.5× and higher compared to the Illumina GSA.
Competing Interest Statement: J.H.L., C.A.M., T.B., and J.K.P. were employees of Gencove, Inc. at the time of writing.
Footnotes: https://gencove-sbir.s3.amazonaws.com/index.html ; https://gitlab.com/gencove/loimpute-public
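The abstract above mentions computing polygenic risk scores from previously derived weights. A minimal Python sketch of that calculation follows, assuming the common additive form: the score is the weighted sum of an individual's imputed allele dosages. Variants without a weight, or without a dosage, are skipped here, which is a simplifying assumption. The variant identifiers, weights, and dosages are invented for illustration.

```python
def polygenic_score(dosages, weights):
    """Additive polygenic score: sum of weight * dosage over shared variants.

    dosages: mapping of variant ID -> imputed alternate-allele dosage (0..2)
    weights: mapping of variant ID -> previously derived effect weight
    Variants present in only one mapping are ignored (assumption).
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical individual and weights; IDs are placeholders, not real variants.
individual = {"varA": 1.0, "varB": 2.0, "varC": 0.0}
effect_weights = {"varA": 0.5, "varB": -0.25, "varD": 1.0}
print(polygenic_score(individual, effect_weights))
```

Because the score is linear in the dosages, any error in imputed dosages carries straight through to the score, which is why imputation accuracy matters for the measurement error the title refers to.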
Relative matching using low coverage sequencing
2020
Finding familial relatives using DNA has multiple applications in genetic genealogy, population genetics, and forensics. So far, most relative matching algorithms rely on detecting identity-by-descent (IBD) segments with high quality genotype data. Recently, low coverage sequencing (LCS) has received growing attention as a promising cost-effective method to ascertain genomic information. However, with higher error rates, it is unclear whether existing IBD detection can work on LCS datasets. Here, we developed and tested a framework for relative matching using sequencing with 1× coverage (1×LCS). We started by exploring the error characteristics of this method compared to array data. Our results show that after some optimization 1×LCS can exhibit the same genotyping discordance rates as the discordance between two array platforms. Using this observation, we developed a hybrid framework for relative matching and tuned this framework with >2,700 pairs of confirmed genealogical relatives that were genotyped using heterogeneous datasets. We then obtained array and 1×LCS data on 19 samples and used our framework to find relatives in a database of over 3 million individuals. The total length of shared segments obtained by 1×LCS was virtually indistinguishable from that obtained by genotyping arrays for matches with a total sharing >200 cM (second cousins or closer). For more distant relatives, as long as those were detected by both technologies, the total length obtained by LCS and by genotyping arrays was highly correlated, with no evidence of over- or underestimation. Taken together, our results show that 1×LCS can be a valid alternative to arrays for relative matching, opening the possibility for further democratization of genomic data.
Competing Interest Statement: E.P., R.S., B.S., T.S., M.A., L.A., D.W.V., Y.N., O.N., and Y.E. are employed by MyHeritage LTD. J.H.L., T.B., and J.K.P. are employed by Gencove Inc. S.C. is a paid consultant of MyHeritage LTD.
Comparing low-pass sequencing and genotyping for trait mapping in pharmacogenetics
2019
Low pass sequencing has been proposed as a cost-effective alternative to genotyping arrays to identify genetic variants that influence multifactorial traits in humans. For common diseases this typically has required both large sample sizes and comprehensive variant discovery. Genotyping arrays are also routinely used to perform pharmacogenetic (PGx) experiments where sample sizes are likely to be significantly smaller, but clinically relevant effect sizes likely to be larger. To assess how low pass sequencing would compare to array based genotyping for PGx we compared a low-pass assay (in which 1x coverage or less of a target genome is sequenced) along with software for genotype imputation to standard approaches. We sequenced 79 individuals to 1x genome coverage and genotyped the same samples on the Affymetrix Axiom Biobank Precision Medicine Research Array (PMRA). We then down-sampled the sequencing data to 0.8x, 0.6x, and 0.4x coverage, and performed imputation. Both the genotype data and the sequencing data were further used to impute human leukocyte antigen (HLA) genotypes for all samples. We compared the sequencing data and the genotyping array data in terms of four metrics: overall concordance, concordance at single nucleotide polymorphisms in pharmacogenetics-related genes, concordance in imputed HLA genotypes, and imputation r². Overall concordance between the two assays ranged from 98.2% (for 0.4x coverage sequencing) to 99.2% (for 1x coverage sequencing), with qualitatively similar numbers for the subsets of variants most important in pharmacogenetics. At common single nucleotide polymorphisms (SNPs), the mean imputation r² from the genotyping array was 90%, which was comparable to the imputation r² from 0.4x coverage sequencing, while the mean imputation r² from 1x sequencing data was 96%. These results indicate that low-pass sequencing to a depth above 0.4x coverage attains higher power for trait mapping when compared to the PMRA.
Low-pass sequencing plus imputation using avidity sequencing displays comparable imputation accuracy to sequencing by synthesis while reducing duplicates
2022
Low-pass sequencing with genotype imputation has been adopted as a cost-effective method for genotyping. The most widely used method of short-read sequencing uses sequencing by synthesis (SBS). Here we perform a study of a novel sequencing technology — avidity sequencing. In this short note, we compare the performance of imputation from low-pass libraries sequenced on an Element AVITI system (which utilizes avidity sequencing) to those sequenced on an Illumina NovaSeq 6000 (which utilizes SBS) with an SP flow cell for the same set of biological samples across a range of genetic ancestries. We observed dramatically lower duplication rates in the data deriving from the AVITI system compared to the NovaSeq 6000, resulting in higher effective coverage given a fixed number of sequenced bases, and comparable imputation accuracy performance between sequencing chemistries across ancestries. This study demonstrates that avidity sequencing is a viable alternative to the standard SBS chemistries for applications involving low-pass sequencing plus imputation.