112 result(s) for "Multivariate discrete data"
Estimation of Copula Models With Discrete Margins via Bayesian Data Augmentation
Estimation of copula models with discrete margins can be difficult beyond the bivariate case. We show how this can be achieved by augmenting the likelihood with continuous latent variables, and computing inference using the resulting augmented posterior. To evaluate this, we propose two efficient Markov chain Monte Carlo sampling schemes. One generates the latent variables as a block using a Metropolis-Hastings step with a proposal that is close to its target distribution; the other generates them one at a time. Our method applies to all parametric copulas where the conditional copula functions can be evaluated, not just elliptical copulas as in much previous work. Moreover, the copula parameters can be estimated jointly with any marginal parameters, and Bayesian selection ideas can be employed. We establish the effectiveness of the estimation method by modeling consumer behavior in online retail using Archimedean and Gaussian copulas. The example shows that elliptical copulas can be poor at modeling dependence in discrete data, just as they can be in the continuous case. To demonstrate the potential in higher dimensions, we estimate 16-dimensional D-vine copulas for a longitudinal model of usage of a bicycle path in the city of Melbourne, Australia. The estimates reveal an interesting serial dependence structure that can be represented in a parsimonious fashion using Bayesian selection of independence pair-copula components. Finally, we extend our results and method to the case where some margins are discrete and others continuous. Supplemental materials for the article are also available online.
Limited Information Goodness-of-fit Testing in Multidimensional Contingency Tables
We introduce a family of goodness-of-fit statistics for testing composite null hypotheses in multidimensional contingency tables. These statistics are quadratic forms in marginal residuals up to order r. They are asymptotically chi-square under the null hypothesis when parameters are estimated using any asymptotically normal consistent estimator. For a widely used item response model, when r is small and multidimensional tables are sparse, the proposed statistics have accurate empirical Type I errors, unlike Pearson’s X². For this model in nonsparse situations, the proposed statistics are also more powerful than X². In addition, the proposed statistics are asymptotically chi-square when applied to subtables, and can be used for a piecewise goodness-of-fit assessment to determine the source of misfit in poorly fitting models.
On Models for Binomial Data with Random Numbers of Trials
A binomial outcome is a count s of the number of successes out of the total number of independent trials n = s + f, where f is a count of the failures. The n are random variables not fixed by design in many studies. Joint modeling of (s, f) can provide additional insight into the science and into the probability π of success that cannot be directly incorporated by the logistic regression model. Observations where n = 0 are excluded from the binomial analysis yet may be important to understanding how π is influenced by covariates. Correlation between s and f may exist and be of direct interest. We propose Bayesian multivariate Poisson models for the bivariate response (s, f), correlated through random effects. We extend our models to the analysis of longitudinal and multivariate longitudinal binomial outcomes. Our methodology was motivated by two disparate examples, one from teratology and one from an HIV tertiary intervention study.
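A standard identity helps explain why Poisson models for the two counts are a natural companion to the binomial analysis. The block below is only a sketch under an assumption not stated in the abstract: that s and f are independent Poisson counts with rates λ_s and λ_f (hypothetical symbols), conditional on any random effects. The paper's actual specification may differ.

```latex
% Assumption (illustrative): s and f are conditionally independent Poisson counts.
\begin{align*}
  s &\sim \mathrm{Poisson}(\lambda_s), \qquad f \sim \mathrm{Poisson}(\lambda_f), \qquad n = s + f, \\
  n &\sim \mathrm{Poisson}(\lambda_s + \lambda_f), \\
  s \mid n &\sim \mathrm{Binomial}(n, \pi), \qquad \pi = \frac{\lambda_s}{\lambda_s + \lambda_f}.
\end{align*}
```

Under this assumption the binomial model is recovered by conditioning on n, while the joint model also uses the information in n itself, including observations with n = 0.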
The New Product Problem: An Approach for Investigating Product Failures
This paper discusses a recently developed variable selection procedure suitable when the available data have discrete components. In the context of the new product problem, a modification to the basic methodology is proposed with a view to identifying product profiles closely associated with success or early failure. A comparative analysis is also undertaken in which the new method is contrasted with the more widely used linear discriminant approach. The results indicate that the proposed methodology stands up well to the more common approach in terms of both practical and classification efficacy.
The normal law under linear restrictions: simulation and estimation via minimax tilting
Simulation from the truncated multivariate normal distribution in high dimensions is a recurrent problem in statistical computing and is typically only feasible by using approximate Markov chain Monte Carlo sampling. We propose a minimax tilting method for exact independently and identically distributed data simulation from the truncated multivariate normal distribution. The new methodology provides both a method for simulation and an efficient estimator to hitherto intractable Gaussian integrals. We prove that the estimator has a rare vanishing relative error asymptotic property. Numerical experiments suggest that the scheme proposed is accurate in a wide range of set-ups for which competing estimation schemes fail. We give an application to exact independently and identically distributed data simulation from the Bayesian posterior of the probit regression model.
Pair Copula Constructions for Multivariate Discrete Data
Multivariate discrete response data can be found in diverse fields, including econometrics, finance, biometrics, and psychometrics. Our contribution, through this study, is to introduce a new class of models for multivariate discrete data based on pair copula constructions (PCCs) that has two major advantages. First, by deriving the conditions under which any multivariate discrete distribution can be decomposed as a PCC, we show that discrete PCCs attain highly flexible dependence structures. Second, the computational burden of evaluating the likelihood for an m-dimensional discrete PCC only grows quadratically with m. This compares favorably to existing models for which computing the likelihood either requires the evaluation of 2ᵐ terms or slow numerical integration methods. We demonstrate the high quality of inference function for margins and maximum likelihood estimates, both under a simulated setting and for an application to a longitudinal discrete dataset on headache severity. This article has online supplementary material.
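The exponential cost mentioned above can be illustrated with the standard inclusion-exclusion formula for the probability mass function of a copula model with discrete margins. The sketch below is not the PCC method of the article. It assumes a Clayton copula and Poisson margins purely for illustration, and the names `poisson_cdf`, `clayton_copula`, and `copula_pmf` are hypothetical.

```python
from itertools import product
from math import exp, factorial


def poisson_cdf(k, rate):
    """P(Y <= k) for a Poisson margin; zero below the support."""
    if k < 0:
        return 0.0
    return sum(exp(-rate) * rate**j / factorial(j) for j in range(k + 1))


def clayton_copula(u, theta):
    """m-dimensional Clayton copula C(u), theta > 0 (illustrative choice)."""
    if min(u) <= 0.0:
        return 0.0
    return (sum(x ** -theta for x in u) - len(u) + 1) ** (-1.0 / theta)


def copula_pmf(y, rates, theta):
    """P(Y = y) by inclusion-exclusion over the 2**m corners of the
    hyper-rectangle (F_j(y_j - 1), F_j(y_j)], j = 1..m."""
    total = 0.0
    for shifts in product((0, 1), repeat=len(y)):  # 2**m copula evaluations
        u = [poisson_cdf(yj - s, r) for yj, s, r in zip(y, shifts, rates)]
        total += (-1) ** sum(shifts) * clayton_copula(u, theta)
    return total


if __name__ == "__main__":
    # Bivariate example: 4 copula evaluations per observation.
    print(copula_pmf((1, 2), rates=(1.5, 2.0), theta=2.0))
```

Each call evaluates the copula at every one of the 2ᵐ corners, so a single observation with m = 16 would already need 65,536 evaluations, which is the kind of burden a quadratic-cost construction avoids.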
Spatiotemporal forecasting models with and without a confounded covariate
The aim of this paper is to analyze the prediction accuracy of multivariate spatiotemporal forecasting models with a confounded covariate versus univariate models without covariates for discrete (count and binary) and continuous response variables by means of theoretical considerations and Monte Carlo simulation. For the simulation, we propose a Bayesian latent Gaussian Markov random fields framework for three types of generalized additive prediction models: (i) a multivariate model with a spatiotemporally confounded covariate only, denoted in the rest of the paper as the multivariate model; (ii) a univariate model with spatiotemporal random effects and their interaction only; and (iii) a full multivariate model consisting of the combination of (i) and (ii), that is, a univariate model combined with a multivariate model. One simulation result is that for all three kinds of response variables, the univariate and the full multivariate model uniformly dominate the multivariate model in terms of prediction accuracy measured by the mean-squared prediction error (MSPE). A second finding is that for discrete variables the univariate model uniformly dominates the full multivariate model. A third result is that for continuous response variables the full multivariate model dominates the univariate model in the case of low confoundedness of the covariate. For high confoundedness, the reverse holds. The results provide important guidelines for practitioners.
A Multivariate Extension of the Dynamic Logit Model for Longitudinal Data Based on a Latent Markov Heterogeneity Structure
For the analysis of multivariate categorical longitudinal data, we propose an extension of the dynamic logit model. The resulting model is based on a marginal parameterization of the conditional distribution of each vector of response variables given the covariates, the lagged response variables, and a set of subject-specific parameters for the unobserved heterogeneity. The latter are assumed to follow a first-order Markov chain. For the maximum likelihood estimation of the model parameters, we outline an EM algorithm. The data analysis approach based on the proposed model is illustrated by a simulation study and an application to a dataset, which derives from the Panel Study on Income Dynamics and concerns fertility and female participation in the labor market.
A multivariate Poisson-log normal mixture model for clustering transcriptome sequencing data
Background: High-dimensional data of discrete and skewed nature is commonly encountered in high-throughput sequencing studies. Analyzing the network itself or the interplay between genes in this type of data continues to present many challenges. As data visualization techniques become cumbersome for higher dimensions and unconvincing when there is no clear separation between homogeneous subgroups within the data, cluster analysis provides an intuitive alternative. The aim of applying mixture model-based clustering in this context is to discover groups of co-expressed genes, which can shed light on biological functions and pathways of gene products. Results: A mixture of multivariate Poisson-log normal (MPLN) model is developed for clustering of high-throughput transcriptome sequencing data. Parameter estimation is carried out using a Markov chain Monte Carlo expectation-maximization (MCMC-EM) algorithm, and information criteria are used for model selection. Conclusions: The mixture of MPLN model is able to fit a wide range of correlation and overdispersion situations, and is suited for modeling multivariate count data from RNA sequencing studies. All scripts used for implementing the method can be found at https://github.com/anjalisilva/MPLNClust.