Catalogue Search | MBRL

Detecting Structure of Haplotypes and Local Ancestry

by Guan, Yongtao in Chromosomes , Chromosomes, Human, Pair 6 , Chromosomes, Human, Pair 8

2014

We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows us to model two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local ancestry of admixed individuals. Our method outperforms competing state-of-the-art methods, particularly for regions of small ancestral track lengths. Applying our method to Mexican samples in HapMap3, we found two regions on chromosomes 6 and 8 that show significant departure of local ancestry from the genome-wide average. A software package implementing the methods described in this article is freely available at http://bcm.edu/cnrc/mcmcmc.

Journal Article

MVP predicts the pathogenicity of missense variants by deep learning

by Long, John J. , Qi, Hongjian , Chen, Chen in 45/23 , 631/114/1305 , 631/114/2184

2021

Accurate pathogenicity prediction of missense variants is critically important in genetic studies and clinical diagnosis. Previously published prediction methods have facilitated the interpretation of missense variants but have limited performance. Here, we describe MVP (Missense Variant Pathogenicity prediction), a new prediction method that uses deep residual network to leverage large training data sets and many correlated predictors. We train the model separately in genes that are intolerant of loss of function variants and the ones that are tolerant in order to take account of potentially different genetic effect size and mode of action. We compile cancer mutation hotspots and de novo variants from developmental disorders for benchmarking. Overall, MVP achieves better performance in prioritizing pathogenic missense variants than previous methods, especially in genes tolerant of loss of function variants. Finally, using MVP, we estimate that de novo coding variants contribute to 7.8% of isolated congenital heart disease, nearly doubling previous estimates. Accurate prediction of variant pathogenicity is essential to understanding genetic risks in disease. Here, the authors present a deep neural network method for prediction of missense variant pathogenicity, MVP, and demonstrate its utility in prioritizing de novo variants contributing to developmental disorders.

Journal Article

BAYESIAN VARIABLE SELECTION REGRESSION FOR GENOME-WIDE ASSOCIATION STUDIES AND OTHER LARGE-SCALE PROBLEMS

by Guan, Yongtao , Stephens, Matthew in association study , Bayesian regression , Datasets

2011

We consider applying Bayesian Variable Selection Regression, or BVSR, to genome-wide association studies and similar large-scale regression problems. Currently, typical genome-wide association studies measure hundreds of thousands, or millions, of genetic variants (SNPs), in thousands or tens of thousands of individuals, and attempt to identify regions harboring SNPs that affect some phenotype or outcome of interest. This goal can naturally be cast as a variable selection regression problem, with the SNPs as the covariates in the regression. Characteristic features of genome-wide association studies include the following: (i) a focus primarily on identifying relevant variables, rather than on prediction; and (ii) many relevant covariates may have tiny effects, making it effectively impossible to confidently identify the complete "correct" subset of variables. Taken together, these factors put a premium on having interpretable measures of confidence for individual covariates being included in the model, which we argue is a strength of BVSR compared with alternatives such as penalized regression methods. Here we focus primarily on analysis of quantitative phenotypes, and on appropriate prior specification for BVSR in this setting, emphasizing the idea of considering what the priors imply about the total proportion of variance in outcome explained by relevant covariates. We also emphasize the potential for BVSR to estimate this proportion of variance explained, and hence shed light on the issue of "missing heritability" in genome-wide association studies. More generally, we demonstrate that, despite the apparent computational challenges, BVSR can provide useful inferences in these large-scale problems, and in our simulations produces better power and predictive performance compared with standard single-SNP analyses and the penalized regression method LASSO. Methods described here are implemented in a software package, pi-MASS, available from the Guan Lab website http://bcm.edu/cnrc/mcmcmc/pimass.

Journal Article

A Composite Likelihood Approach in Fitting Spatial Point Process Models

by Guan, Yongtao in Applications , Asymptotic properties , Clefs

2006

We propose a new likelihood-based approach in fitting spatial point process models. A composite likelihood is first formed by adding some pairwise composite likelihood functions that are defined in terms of the second-order intensity function of the underlying process, and then used for estimating the unknown parameters. The estimation procedure is computationally simple and yields consistent and asymptotically normal estimators under some mild conditions. We demonstrate through a simulation study and applications to two real data examples that the proposed approach may lead to improved estimations compared with the commonly used "minimum contrast estimation" approach.

Journal Article

On Consistent Nonparametric Intensity Estimation for Inhomogeneous Spatial Point Processes

by Guan, Yongtao in Applications , Chi-squared distribution , Consistent estimators

2008

A common nonparametric approach to estimate the intensity function of an inhomogeneous spatial point process is through kernel smoothing. When conducting the smoothing, one typically uses events only in a local set around the point of interest. But the resulting estimator often is inconsistent, because the number of events in a fixed set is of order 1 for spatial point processes. In this article we propose a new covariate-based kernel smoothing method to estimate the intensity function. Our method defines the distance between any two points as the difference between their associated covariate values. Consequently, we determine the kernel weight for a given event of the process as a function of its new distance to the point of interest. Under some suitable conditions on the covariates and the spatial point process, we prove that our new estimator is consistent for the true intensity. To handle the situation with high-dimensional covariates, we also extend sliced inverse regression, a useful dimension-reduction tool in standard regression analysis, to spatial point processes. Simulations and an application to a real data example are used to demonstrate the usefulness of the proposed method.

Journal Article

Strong Selection at MHC in Mexicans since Admixture

by Zhou, Quan , Guan, Yongtao , Zhao, Liang in African Continental Ancestry Group - genetics , Biology and Life Sciences , Datasets

2016

Mexicans are a recent admixture of Amerindians, Europeans, and Africans. We performed local ancestry analysis of Mexican samples from two genome-wide association studies obtained from dbGaP, and discovered that at the MHC region Mexicans have excessive African ancestral alleles compared to the rest of the genome, which is the hallmark of recent selection for admixed samples. The estimated selection coefficients are 0.05 and 0.07 for two datasets, which put our finding among the strongest known selections observed in humans, namely, lactase selection in northern Europeans and sickle-cell trait in Africans. Using inaccurate Amerindian training samples was a major concern for the credibility of previously reported selection signals in Latinos. Taking advantage of the flexibility of our statistical model, we devised a model fitting technique that can learn Amerindian ancestral haplotype from the admixed samples, which allows us to infer local ancestries for Mexicans using only European and African training samples. The strong selection signal at the MHC remains without Amerindian training samples. Finally, we note that medical history studies suggest such a strong selection at MHC is plausible in Mexicans.

Journal Article

Bayes factor for linear mixed model in genetic association studies

by Guan, Yongtao , Levy, Daniel in Bayes factor , Bayesian analysis , BLUP

2026

Bayes factor has advantages over p‐value as test statistics for association, particularly when comparing multiple non‐nested alternative models. An efficient method to compute Bayes factor for linear mixed model (LMM) in the context of genetic association studies is lacking. In this study, we transform the standard LMM to a Bayesian linear regression by substituting the random effect with fixed effects, where the covariates of the fixed effects are eigenvectors of the genetic relatedness matrix and their respective prior effect sizes are proportional to the corresponding eigenvalues. Using conjugate normal inverse gamma priors on regression parameters, Bayes factors can be computed in a closed form. We demonstrate numerically the known relationship between Bayes factors and p‐values for the LMM. We then show that predictions based on the transformed Bayesian linear regression are identical to those of the best linear unbiased prediction (BLUP) of the standard LMM. Our results provided a new perspective and derivation to a known connection between BLUP and Bayesian estimates. Methods described in this note are implemented in the software IDUL as two new functionalities: computing Bayes factors and residuals for the LMM.

Journal Article

Analysis of multispecies point patterns by using multivariate log-Gaussian Cox processes

by Waagepetersen, Rasmus , Mateu, Jorge , Guan, Yongtao in Analysis of covariance , Clustering , Cross-correlation

2016

Multivariate log-Gaussian Cox processes are flexible models for multivariate point patterns. However, they have so far been applied in bivariate cases only. We move beyond the bivariate case to model multispecies point patterns of tree locations. In particular we address the problems of identifying parsimonious models and of extracting biologically relevant information from the models fitted. The latent multivariate Gaussian field is decomposed into components given in terms of random fields common to all species and components which are species specific. This allows a decomposition of variance that can be used to quantify to what extent the spatial variation of a species is governed by common or species-specific factors. Cross-validation is used to select the number of common latent fields to obtain a suitable trade-off between parsimony and fit of the data. The selected number of common latent fields provides an index of complexity of the multivariate covariance structure. Hierarchical clustering is used to identify groups of species with similar patterns of dependence on the common latent fields.

Journal Article

Two-step estimation for inhomogeneous spatial point processes

by Waagepetersen, Rasmus , Guan, Yongtao in Asymptotic normality , Cluster analysis , Clustering

2009

The paper is concerned with parameter estimation for inhomogeneous spatial point processes with a regression model for the intensity function and tractable second-order properties (K-function). Regression parameters are estimated by using a Poisson likelihood score estimating function and in the second step minimum contrast estimation is applied for the residual clustering parameters. Asymptotic normality of parameter estimates is established under certain mixing conditions and we exemplify how the results may be applied in ecological studies of rainforests.

Journal Article

Maternal nutrition at conception modulates DNA methylation of human metastable epialleles

by Dyer, Roger A. , Waterland, Robert A. , Dominguez-Salas, Paula in 45/22 , 45/77 , 631/208/176/1988

2014

In experimental animals, maternal diet during the periconceptional period influences the establishment of DNA methylation at metastable epialleles in the offspring, with permanent phenotypic consequences. Pronounced naturally occurring seasonal differences in the diet of rural Gambian women allowed us to test this in humans. We show that significant seasonal variations in methyl-donor nutrient intake of mothers around the time of conception influence 13 relevant plasma biomarkers. The level of several of these maternal biomarkers predicts increased/decreased methylation at metastable epialleles in DNA extracted from lymphocytes and hair follicles in infants postnatally. Our results demonstrate that maternal nutritional status during early pregnancy causes persistent and systemic epigenetic changes at human metastable epialleles. Maternal diet affects DNA methylation in the developing offspring, leading to phenotypic changes. Here, Dominguez-Salas et al. exploit seasonal variation in the diet of Gambian women to show that maternal methyl donor nutrient status around the time of conception predicts methylation levels at metastable epialleles in infants.

Journal Article
