Catalogue Search | MBRL

Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models

by Wood, Simon N. in Adaptive smoothing, Approximation, Convergence

2011

Recent work by Reiss and Ogden provides a theoretical basis for sometimes preferring restricted maximum likelihood (REML) to generalized cross-validation (GCV) for smoothing parameter selection in semiparametric regression. However, existing REML or marginal likelihood (ML) based methods for semiparametric generalized linear models (GLMs) use iterative REML or ML estimation of the smoothing parameters of working linear approximations to the GLM. Such indirect schemes need not converge and fail to do so in a non-negligible proportion of practical analyses. By contrast, very reliable prediction error criteria smoothing parameter selection methods are available, based on direct optimization of GCV, or related criteria, for the GLM itself. Since such methods directly optimize properly defined functions of the smoothing parameters, they have much more reliable convergence properties. The paper develops the first such method for REML or ML estimation of smoothing parameters. A Laplace approximation is used to obtain an approximate REML or ML for any GLM, which is suitable for efficient direct optimization. This REML or ML criterion requires that Newton-Raphson iteration, rather than Fisher scoring, be used for GLM fitting, and a computationally stable approach to this is proposed. The REML or ML criterion itself is optimized by a Newton method, with the derivatives required obtained by a mixture of implicit differentiation and direct methods. The method will cope with numerical rank deficiency in the fitted model and in fact provides a slight improvement in numerical robustness on the earlier method of Wood for prediction error criteria based smoothness selection. Simulation results suggest that the new REML and ML methods offer some improvement in mean-square error performance relative to GCV or Akaike's information criterion in most cases, without the small number of severe undersmoothing failures to which Akaike's information criterion and GCV are prone. This is achieved at the same computational cost as GCV or Akaike's information criterion. The new approach also eliminates the convergence failures of previous REML- or ML-based approaches for penalized GLMs and usually has lower computational cost than these alternatives. Example applications are presented in adaptive smoothing, scalar on function regression and generalized additive model selection.

Journal Article

Analytical results for directional and quadratic selection gradients for log-linear models of fitness functions

by Morrissey, Michael B., Goudie, I. B. J. in Biological Evolution, Capture‐mark‐recapture, Empirical analysis

2022

Log-linear models are widely used for assessing determinants of fitness in empirical studies, for example, in determining how reproductive output depends on trait values or environmental conditions. Similarly, theoretical works of fitness and natural selection employ log-linear models, often with a negative quadratic term, generating Gaussian fitness functions. However, in the specific application of regression-based analysis of natural selection, such models are rarely employed. Rather, OLS regression is the predominant means of assessing the form of natural selection. OLS regressions allow specific evolutionary quantitative parameters, selection gradients, to be estimated, and benefit from the fact that the associated statistical models are easily applied. We examine whether selection gradients can be directly expressed in terms of the coefficients of models using exponential fitness functions with linear or quadratic arguments. Such models can be easily fitted with generalized linear models (GLMs). The expressions we obtain coincide with those for Gaussian functions, but relax the major constraint that the (log) fitness function is concave (downwardly curved). Additionally these results lead to univariate and multivariate analyses of both linear and quadratic selection that potentially incorporate pragmatic and interpretable models of fitness functions, where the parameters can be related analytically to selection gradients, and that can be operationalized using widely available statistical tools.
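
As a hedged illustration of the linear case (a sketch, not taken from the paper's text): assume a single trait z, an absolute fitness function W(z) = exp(a + bz) with illustrative coefficient names a and b, and a directional selection gradient defined as the mean derivative of relative fitness. Under those assumptions the gradient reduces to the slope on the log scale:

```latex
\beta = \frac{\mathrm{E}\left[\partial W(z)/\partial z\right]}{\mathrm{E}\left[W(z)\right]}
      = \frac{b\,\mathrm{E}\left[e^{a+bz}\right]}{\mathrm{E}\left[e^{a+bz}\right]}
      = b
```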

Journal Article

Bias reduction in exponential family nonlinear models

by Kosmidis, Ioannis, Firth, David in Applications, Asymptotic bias correction, Bias

2009

In Firth (1993, Biometrika) it was shown how the leading term in the asymptotic bias of the maximum likelihood estimator is removed by adjusting the score vector, and that in canonical-link generalized linear models the method is equivalent to maximizing a penalized likelihood that is easily implemented via iterative adjustment of the data. Here a more general family of bias-reducing adjustments is developed for a broad class of univariate and multivariate generalized nonlinear models. The resulting formulae for the adjusted score vector are computationally convenient, and in univariate models they directly suggest implementation through an iterative scheme of data adjustment. For generalized linear models a necessary and sufficient condition is given for the existence of a penalized likelihood interpretation of the method. An illustrative application to the Goodman row-column association model shows how the computational simplicity and statistical benefits of bias reduction extend beyond generalized linear models.

Journal Article

ON ASYMPTOTICALLY OPTIMAL CONFIDENCE REGIONS AND TESTS FOR HIGH-DIMENSIONAL MODELS

by van de Geer, Sara, Dezeure, Ruben, Bühlmann, Peter in 62F25, 62J07, 62J12

2014

We propose a general method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in a high-dimensional model. It can be easily adjusted for multiplicity taking dependence among tests into account. For linear models, our method is essentially the same as in Zhang and Zhang [J. R. Stat. Soc. Ser. B Stat. Methodol. 76 (2014) 217-242]: we analyze its asymptotic properties and establish its asymptotic optimality in terms of semiparametric efficiency. Our method naturally extends to generalized linear models with convex loss functions. We develop the corresponding theory which includes a careful analysis for Gaussian, sub-Gaussian and bounded correlated designs.

Journal Article

Generalized linear model based on latent factors and supervised components

by Bry, Xavier, Gibaud, Julien, Trottier, Catherine in Algorithms, Covariance matrix, Generalized linear models

2025

In the context of component-based multivariate modeling, we propose to model the residual dependence of the responses. Each response of a response vector is assumed to depend, through a Generalized Linear Model, on a set of explanatory variables. The vast majority of explanatory variables are partitioned into conceptually homogeneous variable groups, viewed as explanatory themes. Variables within themes are assumed to be numerous, and some of them are highly correlated or even collinear. Thus, generalized linear regression demands dimension reduction and regularization with respect to each theme. In addition to the themes, we consider a small set of “additional” covariates not conceptually linked to the themes, and demanding no regularization. Supervised Component Generalized Linear Regression proposed to both regularize and reduce the dimension of the explanatory space by searching each theme for an appropriate number of orthogonal components, which both contribute to predicting the responses and capture relevant structural information in themes. In this paper, we introduce random latent variables (a.k.a. factors) so as to model the covariance matrix of the linear predictors of the responses conditional on the components. To estimate the model, we present an algorithm combining supervised component-based model estimation with factor model estimation. This methodology is tested on simulated data and then applied to an agricultural ecology dataset.

Journal Article

MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data

by Finak, Greg, Deng, Jingyuan, Slichter, Chloe K. in Animal Genetics and Genomics, Animals, Bioinformatics

2015

Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST.

Journal Article

Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse

by Schielzeth, Holger, Forstmeier, Wolfgang in Animal Ecology, Behavior modeling, Behavioral biology

2011

Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
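
The 40% figure quoted in this abstract can be checked with a short calculation. The sketch below assumes independent tests at a significance level of 0.05; both assumptions are ours and are not stated in the abstract. The function name `familywise_error_rate` is illustrative only.

```python
from math import comb


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among n_tests
    independent tests, each run at level alpha (assumed values)."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_predictors = 4
# Four main effects plus all two-way interactions: 4 + C(4, 2) = 10 terms.
n_tests = n_predictors + comb(n_predictors, 2)
rate = familywise_error_rate(n_tests)
print(f"{n_tests} tests -> familywise error rate {rate:.3f}")
```

With ten terms the rate comes out at roughly 0.40, which agrees with the "theoretical expectation" the abstract refers to for large N/k.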

Journal Article

Double hierarchical generalized linear models (with discussion)

by Nelder, John A., Lee, Youngjo in Algorithms, Applications, Binary data

2006

We propose a class of double hierarchical generalized linear models in which random effects can be specified for both the mean and dispersion. Heteroscedasticity between clusters can be modelled by introducing random effects in the dispersion model, as is heterogeneity between clusters in the mean model. This class will, among other things, enable models with heavy-tailed distributions to be explored, providing robust estimation against outliers. The h-likelihood provides a unified framework for this new class of models and gives a single algorithm for fitting all members of the class. This algorithm does not require quadrature or prior probabilities.

Journal Article

High-Dimensional Inference: Confidence Intervals, p-Values and R-Software hdi

by Meier, Lukas, Meinshausen, Nicolai, Dezeure, Ruben in Clustering, confidence interval, Confidence intervals

2015

We present a (selective) review of recent frequentist high-dimensional inference methods for constructing p-values and confidence intervals in linear and generalized linear models. We include a broad, comparative empirical study which complements the viewpoint from statistical methodology and theory. Furthermore, we introduce and illustrate the R-package hdi which easily allows the use of different methods and supports reproducibility.

Journal Article

Hierarchical generalised linear models: A synthesis of generalised linear models, random‐effect models and structured dispersions

by Nelder, John A., Lee, Youngjo in Approximation, Bayesian analysis, Binomials

2001

Hierarchical generalised linear models are developed as a synthesis of generalised linear models, mixed linear models and structured dispersions. We generalise the restricted maximum likelihood method for the estimation of dispersion to the wider class and show how the joint fitting of models for mean and dispersion can be expressed by two interconnected generalised linear models. The method allows models with (i) any combination of a generalised linear model distribution for the response with any conjugate distribution for the random effects, (ii) structured dispersion components, (iii) different link and variance functions for the fixed and random effects, and (iv) the use of quasilikelihoods in place of likelihoods for either or both of the mean and dispersion models. Inferences can be made by applying standard procedures, in particular those for model checking, to components of either generalised linear model. We also show by numerical studies that the new method gives an efficient estimation procedure for a substantial class of models of practical importance. Likelihood‐type inference is extended to this wide class of models in a unified way.

Journal Article
