Catalogue Search | MBRL

by Kéry, Marc in Animals , binomial N‐mixture model , Birds

2018

Binomial N-mixture models have proven very useful in ecology, conservation, and monitoring: they allow estimation and modeling of abundance separately from detection probability using simple counts. Recently, doubts about parameter identifiability have been voiced. I conducted a large-scale screening test with 137 bird data sets from 2,037 sites. I found virtually no identifiability problems for Poisson and zero-inflated Poisson (ZIP) binomial N-mixture models, but negative-binomial (NB) models had problems in 25% of all data sets. The corresponding multinomial N-mixture models had no problems. Parameter estimates under Poisson and ZIP binomial and multinomial N-mixture models were extremely similar. Identifiability problems became a little more frequent with smaller sample sizes (267 and 50 sites), but were unaffected by whether the models did or did not include covariates. Hence, binomial N-mixture model parameters with Poisson and ZIP mixtures typically appeared identifiable. In contrast, NB mixtures were often unidentifiable, which is worrying since these were often selected by Akaike’s information criterion. Identifiability of binomial N-mixture models should always be checked. If problems are found, simpler models, integrated models that combine different observation models, or the use of external information via informative priors or penalized likelihoods may help.

Journal Article


On the Reliability of N-Mixture Models for Count Data

by Link, William A. , Sauer, John R. , Schofield, Matthew R. in Abundance , Ancillary statistic , Animal Distribution

2018

N-mixture models describe count data replicated in time and across sites in terms of abundance N and detectability p. They are popular because they allow inference about N while controlling for factors that influence p without the need for marking animals. Using a capture-recapture perspective, we show that the loss of information that results from not marking animals is critical, making reliable statistical modeling of N and p problematic using just count data. One cannot reliably fit a model in which the detection probabilities are distinct among repeat visits as this model is overspecified. This makes uncontrolled variation in p problematic. By counterexample, we show that even if p is constant after adjusting for covariate effects (the "constant p" assumption), scientifically plausible alternative models in which N (or its expectation) is non-identifiable or does not even exist as a parameter lead to data that are practically indistinguishable from data generated under an N-mixture model. This is particularly the case for sparse data as is commonly seen in applications. We conclude that under the constant p assumption reliable inference is only possible for relative abundance in the absence of questionable and/or untestable assumptions or with better quality data than seen in typical applications. Relative abundance models for counts can be readily fitted using Poisson regression in standard software such as R and are sufficiently flexible to allow controlling for p through the use of covariates while simultaneously modeling variation in relative abundance. If users require estimates of absolute abundance, they should collect auxiliary data that help with estimation of p.
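For readers unfamiliar with the model class this abstract describes, the following is a minimal sketch of a binomial N-mixture likelihood. It assumes a Poisson mixture for site abundance N with mean `lam`, a constant detection probability `p`, and a truncation bound `n_max` for the sum over N. All function names are hypothetical illustrations and are not taken from the article.

```python
import math


def poisson_pmf(n, lam):
    # Poisson probability of abundance n given mean lam (computed on the log scale).
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binom_pmf(y, n, p):
    # Probability of counting y animals out of n present, each detected with probability p.
    if y > n:
        return 0.0
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)


def site_likelihood(counts, lam, p, n_max=200):
    # Marginal likelihood of the repeated counts at one site:
    # sum over the unobserved abundance N of Poisson(N) * prod_t Binomial(y_t | N, p).
    # n_max is an assumed truncation point for the infinite sum.
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binom_pmf(y, n, p)
        total += term
    return total


def log_likelihood(data, lam, p, n_max=200):
    # Sites are treated as independent, so their log-likelihoods add up.
    return sum(math.log(site_likelihood(c, lam, p, n_max)) for c in data)
```

With a single visit per site, this marginal likelihood reduces to a Poisson distribution with mean `lam * p`, so only the product of abundance and detection is informed by the data. Separating N from p relies on the repeated visits and on the assumed mixture.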

Journal Article


Distinct trajectories of physical activity and related factors during the life course in the general population: a systematic review

by Tammelin, Tuija H. , Hirvensalo, Mirja , Palomäki, Sanna in Biostatistics , Chronic diseases , Elderly

2019

Background: In recent years, researchers have begun applying a trajectory approach to identify homogeneous subgroups of physical activity (PA) in heterogeneous populations. This study systematically reviewed the articles identifying longitudinal PA trajectory classes and the related factors (e.g., determinants, predictors, and outcomes) in the general population during different life phases. Methods: The included studies used finite mixture models for identifying trajectories of PA, exercise, or sport participation. Three electronic databases, PubMed (Medline), Web of Science, and CINAHL, were searched from the year 2000 to 13 February 2018. The study was conducted according to the PRISMA recommendations. Results: Twenty-seven articles were included and organized into three age groups: youngest (eleven articles), middle (eight articles), and oldest (eight articles). The youngest group consisted mainly of youth, the middle group of adults and the oldest group of late middle-aged and older adults. Most commonly, three or four trajectory classes were reported. Several trajectories describing a decline in PA were reported, especially in the youngest group, whereas trajectories of consistently increasing PA were observed in the middle and oldest groups. While the proportion of persistently physically inactive individuals increased with age, the proportion was relatively high at all ages. Generally, male gender, being Caucasian, non-smoking, having low television viewing time, higher socioeconomic status, no chronic illnesses, and family support for PA were associated either with persistent or increasing PA. Conclusions: The reviewed articles identified various PA subgroups, indicating that finite mixture modeling can yield new information on the complexity of PA behavior compared to studying population mean PA level only. The studies also provided novel information on how different factors relate to changes in PA during the life course. The recognition of the PA subgroups and their determinants is important for the more precise targeting of PA promotion and PA interventions. Trial registration: PROSPERO registration number: CRD42018088120.

Journal Article


An Overview of Semiparametric Extensions of Finite Mixture Models

by Yang, Guangren , Xiang, Sijia , Yao, Weixin in Data structures , Economic models , Epidemiology

2019

Finite mixture models have offered a very important tool for exploring complex data structures in many scientific areas, such as economics, epidemiology and finance. Semiparametric mixture models, which were introduced into traditional finite mixture models in the past decade, have brought forth exciting developments in their methodologies, theories, and applications. In this article, we not only provide a selective overview of the newly-developed semiparametric mixture models, but also discuss their estimation methodologies, theoretical properties if applicable, and some open questions. Recent developments are also discussed.

Journal Article


On the robustness of N-mixture models

by Link, William A. , Sauer, John R. , Schofield, Matthew R. in abundance estimation , Animals , Bayesian P‐value

2018

N-mixture models provide an appealing alternative to mark–recapture models, in that they allow for estimation of detection probability and population size from count data, without requiring that individual animals be identified. There is, however, a cost to using the N-mixture models: inference is very sensitive to the model’s assumptions. We consider the effects of three violations of assumptions that might reasonably be expected in practice: double counting, unmodeled variation in population size over time, and unmodeled variation in detection probability over time. These three examples show that small violations of assumptions can lead to large biases in estimation. The violations of assumptions we consider are not only small qualitatively, but are also small in the sense that they are unlikely to be detected using goodness-of-fit tests. In cases where reliable estimates of population size are needed, we encourage investigators to allocate resources to acquiring additional data, such as recaptures of marked individuals, for estimation of detection probabilities.

Journal Article


Scalable Empirical Mixture Models That Account for Across-Site Compositional Heterogeneity

by Lartillot, Nicolas , Schrempf, Dominik , Szöllősi, Gergely in Amino acids , Cluster analysis , Coordinate transformations

2020

Biochemical demands constrain the range of amino acids acceptable at specific sites resulting in across-site compositional heterogeneity of the amino acid replacement process. Phylogenetic models that disregard this heterogeneity are prone to systematic errors, which can lead to severe long-branch attraction artifacts. State-of-the-art models accounting for across-site compositional heterogeneity include the CAT model, which is computationally expensive, and empirical distribution mixture models estimated via maximum likelihood (C10–C60 models). Here, we present a new, scalable method EDCluster for finding empirical distribution mixture models involving a simple cluster analysis. The cluster analysis utilizes specific coordinate transformations which allow the detection of specialized amino acid distributions either from curated databases or from the alignment at hand. We apply EDCluster to the HOGENOM and HSSP databases in order to provide universal distribution mixture (UDM) models comprising up to 4,096 components. Detailed analyses of the UDM models demonstrate the removal of various long-branch attraction artifacts and improved performance compared with the C10–C60 models. Ready-to-use implementations of the UDM models are provided for three established software packages (IQ-TREE, Phylobayes, and RevBayes).

Journal Article


Computationally efficient multi-sample flow cytometry data analysis using Gaussian mixture models

by Cloos, Jacqueline , van Wieringen, Wessel N. , Rutten, Philip in Algorithms , Bayes Theorem , Bayesian analysis

2025

Background: An important challenge in flow cytometry (FCM) data analysis is making comparisons of corresponding cell populations across multiple FCM samples. An interesting solution is creating a statistical mixture model for multiple samples simultaneously, as such a multi-sample model can characterize a heterogeneous set of samples, and facilitates direct comparison of cell populations across the data samples. The multi-sample approach to statistical mixture modeling has been explored in a number of reports, mostly within a Bayesian framework and with high computational complexity. Although these approaches are effective, they are also computationally demanding, and therefore do not relate well to the requirement of scalability, which is essential in the multi-sample setting. This limits their utility in the analysis of large sets of large FCM samples. Results: We show that basic Gaussian mixture models can be extended to large data sets consisting of multiple samples, using a computationally efficient implementation of the expectation-maximization algorithm. We show that the multi-sample Gaussian mixture model (MSGMM) is competitive with other models, in both rare cell detection and sample classification accuracy. This allows us to further explore the utility of MSGMMs in the analysis of heterogeneous sets of samples. We demonstrate how simple heuristics on MSGMM model output can directly reveal structural patterns in a collection of FCM samples. Conclusions: We recover the efficiency and utility of the basic MSGMM which underlies more complex and non-parametric Bayesian hierarchical mixture models. The possibility of fitting GMMs to large sets of FCM samples provides opportunities for the discovery of associations between sample composition and sample meta-data such as treatment responses and clinical outcomes.

Journal Article


Models for Estimating Abundance from Repeated Counts of an Open Metapopulation

by Dail, D. , Madsen, L. in Anas platyrhynchos , Animal populations , Animals

2011

Using only spatially and temporally replicated point counts, Royle (2004b, Biometrics 60, 108-115) developed an N-mixture model to estimate the abundance of an animal population when individual animal detection probability is unknown. One assumption inherent in this model is that the animal populations at each sampled location are closed with respect to migration, births, and deaths throughout the study. In the past this has been verified solely by biological arguments related to the study design as no statistical verification was available. In this article, we propose a generalization of the N-mixture model that can be used to formally test the closure assumption. Additionally, when applied to an open metapopulation, the generalized model provides estimates of population dynamics parameters and yields abundance estimates that account for imperfect detection probability and do not require the closure assumption. A simulation study shows these abundance estimates are less biased than the abundance estimate obtained from the original N-mixture model. The proposed model is then applied to two data sets of avian point counts. The first example demonstrates the closure test on a single-season study of Mallards (Anas platyrhynchos), and the second uses the proposed model to estimate the population dynamics parameters and yearly abundance of American robins (Turdus migratorius) from a multi-year study.

Journal Article


Latent variable mixture modeling in psychiatric research – a review and application

by Nordström, T. , Kaakinen, M. , Ahmed, A. O. in Cross-Sectional Studies , Factor analysis , Factor Analysis, Statistical

2016

Latent variable mixture modeling represents a flexible approach to investigating population heterogeneity by sorting cases into latent but non-arbitrary subgroups that are more homogeneous. The purpose of this selective review is to provide a non-technical introduction to mixture modeling in a cross-sectional context. Latent class analysis is used to classify individuals into homogeneous subgroups (latent classes). Factor mixture modeling represents a newer approach that represents a fusion of latent class analysis and factor analysis. Factor mixture models are adaptable to representing categorical and dimensional states of affairs. This article provides an overview of latent variable mixture models and illustrates the application of these methods by applying them to the study of the latent structure of psychotic experiences. The flexibility of latent variable mixture models makes them adaptable to the study of heterogeneity in complex psychiatric and psychological phenomena. They also allow researchers to address research questions that directly compare the viability of dimensional, categorical and hybrid conceptions of constructs.

Journal Article


Identifying Mixtures of Mixtures Using Bayesian Estimation

by Frühwirth-Schnatter, Sylvia , Malsiner-Walli, Gertraud , Grün, Bettina in Bayesian analysis , Bayesian Models , Bayesian nonparametric mixture model

2017

The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior, where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at. In addition, this prior allows us to estimate the model using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach allows us to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semiparametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark datasets. Supplementary materials for this article are available online.

Journal Article
