10,229 results for "Mixture models"
Identifiability in N-mixture models
Binomial N-mixture models have proven very useful in ecology, conservation, and monitoring: they allow estimation and modeling of abundance separately from detection probability using simple counts. Recently, doubts about parameter identifiability have been voiced. I conducted a large-scale screening test with 137 bird data sets from 2,037 sites. I found virtually no identifiability problems for Poisson and zero-inflated Poisson (ZIP) binomial N-mixture models, but negative-binomial (NB) models had problems in 25% of all data sets. The corresponding multinomial N-mixture models had no problems. Parameter estimates under Poisson and ZIP binomial and multinomial N-mixture models were extremely similar. Identifiability problems became a little more frequent with smaller sample sizes (267 and 50 sites), but were unaffected by whether the models did or did not include covariates. Hence, binomial N-mixture model parameters with Poisson and ZIP mixtures typically appeared identifiable. In contrast, NB mixtures were often unidentifiable, which is worrying since these were often selected by Akaike's information criterion. Identifiability of binomial N-mixture models should always be checked. If problems are found, simpler models, integrated models that combine different observation models, or the use of external information via informative priors or penalized likelihoods may help.
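As a concrete illustration of the likelihood whose identifiability is being screened, here is a minimal pure-Python sketch of the Poisson binomial N-mixture log-likelihood. The function name, the toy data, and the truncation bound K are illustrative choices, not taken from the paper; real analyses would use dedicated software such as the unmarked package in R.

```python
import math

def nmixture_loglik(lam, p, counts, K=100):
    """Log-likelihood of a Poisson binomial N-mixture model (toy sketch).

    lam: Poisson mean abundance; p: per-visit detection probability;
    counts: one list of repeated counts y_ij per site;
    K: truncation bound for the (infinite) sum over latent abundance N.
    """
    ll = 0.0
    for site in counts:
        site_lik = 0.0
        for N in range(max(site), K + 1):
            # Poisson prior on the latent abundance N at this site
            pois = math.exp(-lam) * lam ** N / math.factorial(N)
            # binomial detection model for each repeated visit to the site
            binom = 1.0
            for y in site:
                binom *= math.comb(N, y) * p ** y * (1 - p) ** (N - y)
            site_lik += pois * binom
        ll += math.log(site_lik)
    return ll
```

Maximizing this function over (lam, p), e.g. on a grid, gives the N-mixture estimates; identifiability problems show up as ridges or flat regions in this likelihood surface.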
On the Reliability of N-Mixture Models for Count Data
N-mixture models describe count data replicated in time and across sites in terms of abundance N and detectability p. They are popular because they allow inference about N while controlling for factors that influence p without the need for marking animals. Using a capture-recapture perspective, we show that the loss of information that results from not marking animals is critical, making reliable statistical modeling of N and p problematic using just count data. One cannot reliably fit a model in which the detection probabilities are distinct among repeat visits as this model is overspecified. This makes uncontrolled variation in p problematic. By counterexample, we show that even if p is constant after adjusting for covariate effects (the "constant p" assumption), scientifically plausible alternative models in which N (or its expectation) is non-identifiable or does not even exist as a parameter lead to data that are practically indistinguishable from data generated under an N-mixture model. This is particularly the case for sparse data as is commonly seen in applications. We conclude that under the constant p assumption reliable inference is only possible for relative abundance in the absence of questionable and/or untestable assumptions or with better quality data than seen in typical applications. Relative abundance models for counts can be readily fitted using Poisson regression in standard software such as R and are sufficiently flexible to allow controlling for p through the use of covariates while simultaneously modeling variation in relative abundance. If users require estimates of absolute abundance, they should collect auxiliary data that help with estimation of p.
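The relative-abundance recommendation above amounts to a Poisson log-linear regression. A minimal Newton-Raphson sketch for one covariate follows; the function name and toy data are illustrative, and a real analysis would use glm(..., family = poisson) in R or an equivalent GLM routine rather than hand-rolled code.

```python
import math

def poisson_regression(y, x, iters=25):
    """Newton-Raphson fit of log E[y_i] = b0 + b1 * x_i (Poisson GLM sketch)."""
    # start from the intercept-only fit, which keeps the Newton steps stable
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = math.exp(b0 + b1 * xi)   # fitted mean at current estimates
            g0 += yi - mu                 # score for b0
            g1 += (yi - mu) * xi          # score for b1
            h00 += mu                     # information matrix entries
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det  # solve the 2x2 Newton system by hand
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# toy counts that grow roughly like exp(x): the fitted slope is close to 1
b0, b1 = poisson_regression([1, 3, 7, 20], [0, 1, 2, 3])
```

Covariates thought to drive detectability can simply be added to the linear predictor, which is the sense in which the abstract says p can be controlled for while modeling relative abundance.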
Distinct trajectories of physical activity and related factors during the life course in the general population: a systematic review
Background In recent years, researchers have begun applying a trajectory approach to identify homogeneous subgroups of physical activity (PA) in heterogeneous populations. This study systematically reviewed the articles identifying longitudinal PA trajectory classes and the related factors (e.g., determinants, predictors, and outcomes) in the general population during different life phases. Methods The included studies used finite mixture models for identifying trajectories of PA, exercise, or sport participation. Three electronic databases, PubMed (Medline), Web of Science, and CINAHL, were searched from the year 2000 to 13 February 2018. The study was conducted according to the PRISMA recommendations. Results Twenty-seven articles were included and organized into three age groups: youngest (eleven articles), middle (eight articles), and oldest (eight articles). The youngest group consisted mainly of youth, the middle group of adults, and the oldest group of late middle-aged and older adults. Most commonly, three or four trajectory classes were reported. Several trajectories describing a decline in PA were reported, especially in the youngest group, whereas trajectories of consistently increasing PA were observed in the middle and oldest groups. While the proportion of persistently physically inactive individuals increased with age, the proportion was relatively high at all ages. Generally, male gender, being Caucasian, non-smoking, having low television viewing time, higher socioeconomic status, no chronic illnesses, and family support for PA were associated either with persistent or increasing PA. Conclusions The reviewed articles identified various PA subgroups, indicating that finite mixture modeling can yield new information on the complexity of PA behavior compared to studying population mean PA level only. The studies also provided novel information on how different factors relate to changes in PA during the life course. The recognition of the PA subgroups and their determinants is important for the more precise targeting of PA promotion and PA interventions. Trial registration PROSPERO registration number: CRD42018088120.
An Overview of Semiparametric Extensions of Finite Mixture Models
Finite mixture models have offered a very important tool for exploring complex data structures in many scientific areas, such as economics, epidemiology and finance. Semiparametric mixture models, which were introduced into traditional finite mixture models in the past decade, have brought forth exciting developments in their methodologies, theories, and applications. In this article, we not only provide a selective overview of the newly-developed semiparametric mixture models, but also discuss their estimation methodologies, theoretical properties if applicable, and some open questions. Recent developments are also discussed.
On the robustness of N-mixture models
N-mixture models provide an appealing alternative to mark–recapture models, in that they allow for estimation of detection probability and population size from count data, without requiring that individual animals be identified. There is, however, a cost to using the N-mixture models: inference is very sensitive to the model’s assumptions. We consider the effects of three violations of assumptions that might reasonably be expected in practice: double counting, unmodeled variation in population size over time, and unmodeled variation in detection probability over time. These three examples show that small violations of assumptions can lead to large biases in estimation. The violations of assumptions we consider are not only small qualitatively, but are also small in the sense that they are unlikely to be detected using goodness-of-fit tests. In cases where reliable estimates of population size are needed, we encourage investigators to allocate resources to acquiring additional data, such as recaptures of marked individuals, for estimation of detection probabilities.
Scalable Empirical Mixture Models That Account for Across-Site Compositional Heterogeneity
Biochemical demands constrain the range of amino acids acceptable at specific sites, resulting in across-site compositional heterogeneity of the amino acid replacement process. Phylogenetic models that disregard this heterogeneity are prone to systematic errors, which can lead to severe long-branch attraction artifacts. State-of-the-art models accounting for across-site compositional heterogeneity include the CAT model, which is computationally expensive, and empirical distribution mixture models estimated via maximum likelihood (C10–C60 models). Here, we present a new, scalable method, EDCluster, for finding empirical distribution mixture models involving a simple cluster analysis. The cluster analysis utilizes specific coordinate transformations which allow the detection of specialized amino acid distributions either from curated databases or from the alignment at hand. We apply EDCluster to the HOGENOM and HSSP databases in order to provide universal distribution mixture (UDM) models comprising up to 4,096 components. Detailed analyses of the UDM models demonstrate the removal of various long-branch attraction artifacts and improved performance compared with the C10–C60 models. Ready-to-use implementations of the UDM models are provided for three established software packages (IQ-TREE, Phylobayes, and RevBayes).
Computationally efficient multi-sample flow cytometry data analysis using Gaussian mixture models
Background An important challenge in flow cytometry (FCM) data analysis is making comparisons of corresponding cell populations across multiple FCM samples. An interesting solution is creating a statistical mixture model for multiple samples simultaneously, as such a multi-sample model can characterize a heterogeneous set of samples, and facilitates direct comparison of cell populations across the data samples. The multi-sample approach to statistical mixture modeling has been explored in a number of reports, mostly within a Bayesian framework and with high computational complexity. Although these approaches are effective, they are also computationally demanding, and therefore do not relate well to the requirement of scalability, which is essential in the multi-sample setting. This limits their utility in the analysis of large sets of large FCM samples. Results We show that basic Gaussian mixture models can be extended to large data sets consisting of multiple samples, using a computationally efficient implementation of the expectation-maximization algorithm. We show that the multi-sample Gaussian mixture model (MSGMM) is competitive with other models, in both rare cell detection and sample classification accuracy. This allows us to further explore the utility of MSGMMs in the analysis of heterogeneous sets of samples. We demonstrate how simple heuristics on MSGMM model output can directly reveal structural patterns in a collection of FCM samples. Conclusions We recover the efficiency and utility of the basic MSGMM which underlies more complex and non-parametric Bayesian hierarchical mixture models. The possibility of fitting GMMs to large sets of FCM samples provides opportunities for the discovery of associations between sample composition and sample meta-data such as treatment responses and clinical outcomes.
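The expectation-maximization scheme underlying such Gaussian mixture fits can be sketched for the simplest case of one dimension and two components. This is a toy illustration only: real FCM data are multivariate with many samples and components, and the cited work uses an optimized multi-sample implementation; all names here are illustrative.

```python
import math

def em_gmm_1d(data, iters=100):
    """EM for a two-component univariate Gaussian mixture (toy sketch)."""
    data = sorted(data)
    n = len(data)
    mu = [data[n // 4], data[3 * n // 4]]   # crude quartile initialization
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E step: responsibility of each component for each point
        resp = []
        for xv in data:
            dens = [w[k] * math.exp(-(xv - mu[k]) ** 2 / (2 * var[k]))
                    / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            s = dens[0] + dens[1]
            resp.append([d / s for d in dens])
        # M step: responsibility-weighted updates of weights, means, variances
        for k in range(2):
            rk = sum(r[k] for r in resp)
            w[k] = rk / n
            mu[k] = sum(r[k] * xv for r, xv in zip(resp, data)) / rk
            var[k] = sum(r[k] * (xv - mu[k]) ** 2 for r, xv in zip(resp, data)) / rk
    return w, mu, var
```

The fitted components play the role of cell populations; comparing component parameters across samples is what the multi-sample extension makes efficient.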
Models for Estimating Abundance from Repeated Counts of an Open Metapopulation
Using only spatially and temporally replicated point counts, Royle (2004b, Biometrics 60, 108-115) developed an N-mixture model to estimate the abundance of an animal population when individual animal detection probability is unknown. One assumption inherent in this model is that the animal populations at each sampled location are closed with respect to migration, births, and deaths throughout the study. In the past this has been verified solely by biological arguments related to the study design as no statistical verification was available. In this article, we propose a generalization of the N-mixture model that can be used to formally test the closure assumption. Additionally, when applied to an open metapopulation, the generalized model provides estimates of population dynamics parameters and yields abundance estimates that account for imperfect detection probability and do not require the closure assumption. A simulation study shows these abundance estimates are less biased than the abundance estimate obtained from the original N-mixture model. The proposed model is then applied to two data sets of avian point counts. The first example demonstrates the closure test on a single-season study of Mallards (Anas platyrhynchos), and the second uses the proposed model to estimate the population dynamics parameters and yearly abundance of American robins (Turdus migratorius) from a multi-year study.
Latent variable mixture modeling in psychiatric research – a review and application
Latent variable mixture modeling represents a flexible approach to investigating population heterogeneity by sorting cases into latent but non-arbitrary subgroups that are more homogeneous. The purpose of this selective review is to provide a non-technical introduction to mixture modeling in a cross-sectional context. Latent class analysis is used to classify individuals into homogeneous subgroups (latent classes). Factor mixture modeling is a newer approach that fuses latent class analysis and factor analysis. Factor mixture models are adaptable to representing categorical and dimensional states of affairs. This article provides an overview of latent variable mixture models and illustrates the application of these methods by applying them to the study of the latent structure of psychotic experiences. The flexibility of latent variable mixture models makes them adaptable to the study of heterogeneity in complex psychiatric and psychological phenomena. They also allow researchers to address research questions that directly compare the viability of dimensional, categorical and hybrid conceptions of constructs.
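The latent class analysis described here can itself be sketched as an EM fit of class proportions and per-class item-endorsement probabilities for binary indicators. This is a toy two-class sketch with a deterministic symmetry-breaking start; the function name and data layout are illustrative, and applied work would use dedicated software (e.g. Mplus or the poLCA package in R).

```python
def lca_em(data, iters=200):
    """EM for a two-class latent class model with binary items (toy sketch).

    data: list of 0/1 indicator vectors, one per respondent.
    Returns class proportions pi and per-class item probabilities theta.
    """
    n, J = len(data), len(data[0])
    pi = [0.5, 0.5]
    # deterministic symmetry-breaking start: class 0 leans toward endorsing items
    theta = [[0.6] * J, [0.4] * J]
    for _ in range(iters):
        # E step: posterior class membership for each respondent
        resp = []
        for row in data:
            probs = []
            for k in range(2):
                p = pi[k]
                for j, xj in enumerate(row):
                    p *= theta[k][j] if xj else 1 - theta[k][j]
                probs.append(p)
            s = probs[0] + probs[1]
            resp.append([p / s for p in probs])
        # M step: update class sizes and item-endorsement probabilities
        for k in range(2):
            rk = sum(r[k] for r in resp)
            pi[k] = rk / n
            for j in range(J):
                theta[k][j] = sum(r[k] * row[j] for r, row in zip(resp, data)) / rk
    return pi, theta
```

Factor mixture models extend this by replacing the within-class independence of items with a factor model, which is what lets them represent dimensional as well as categorical structure.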
Accounting for imperfect detection and survey bias in statistical analysis of presence‐only data
AIM: During the past decade ecologists have attempted to estimate the parameters of species distribution models by combining locations of species presence observed in opportunistic surveys with spatially referenced covariates of occurrence. Several statistical models have been proposed for the analysis of presence‐only data, but these models have largely ignored the effects of imperfect detection and survey bias. In this paper I describe a model‐based approach for the analysis of presence‐only data that accounts for errors in the detection of individuals and for biased selection of survey locations. INNOVATION: I develop a hierarchical, statistical model that allows presence‐only data to be analysed in conjunction with data acquired independently in planned surveys. One component of the model specifies the spatial distribution of individuals within a bounded, geographic region as a realization of a spatial point process. A second component of the model specifies two kinds of observations, the detection of individuals encountered during opportunistic surveys and the detection of individuals encountered during planned surveys. MAIN CONCLUSIONS: Using mathematical proof and simulation‐based comparisons, I demonstrate that biases induced by errors in detection or biased selection of survey locations can be reduced or eliminated by using the hierarchical model to analyse presence‐only data in conjunction with counts observed in planned surveys. I show that a relatively small amount of high‐quality data (from planned surveys) can be used to leverage the information in presence‐only observations, which usually have broad spatial coverage but may not be informative of both occurrence and detectability of individuals. Because a variety of sampling protocols can be used in planned surveys, this approach to the analysis of presence‐only data is widely applicable. In addition, since the point‐process model is formulated at the level of an individual, it can be extended to account for biological interactions between individuals and temporal changes in their spatial distributions.