Catalogue Search | MBRL
302 result(s) for "Semi-parametric models"
The Informational Content of Geographical Indications
by Jean-Sauveur Ay
in Lobbying
2021
Geographical indications (GIs) convey information about the place of production as a proxy for the attributes of agricultural products. We define the informational content of the GI proxy as its capacity to describe the tangible characteristics of production sites, instead of random noise or intangible factors from political bargaining about designation (i.e., lobbying effects). We estimate econometrically the informational content of wine‐related GIs for the Côte d'Or region of Burgundy, France. We show that GIs signal vineyard attributes with high precision, while we find some persistent bias from lobbying effects. We also study alternative classifications, from history and from simulations, which reveal a significant increase in the informational content of GIs over the last hundred years or so, and provide guidelines for better designated GIs in the future.
A Semi-parametric Transformation Frailty Model for Semi-competing Risks Survival Data
2017
In the analysis of semi-competing risks data, interest lies in estimation and inference with respect to a so-called non-terminal event, the observation of which is subject to a terminal event. Multi-state models are commonly used to analyse such data, with covariate effects on the transition/intensity functions typically specified via the Cox model and dependence between the non-terminal and terminal events specified, in part, by a unit-specific shared frailty term. To ensure identifiability, the frailties are typically assumed to arise from a parametric distribution, specifically a Gamma distribution with mean 1.0 and variance, say, σ². When the frailty distribution is misspecified, however, the resulting estimator is not guaranteed to be consistent, with the extent of asymptotic bias depending on the discrepancy between the assumed and true frailty distributions. In this paper, we propose a novel class of transformation models for semi-competing risks analysis that permit the non-parametric specification of the frailty distribution. To ensure identifiability, the class restricts to parametric specifications of the transformation and the error distribution; the latter are flexible, however, and cover a broad range of possible specifications. We also derive the semi-parametric efficient score under the complete data setting and propose a non-parametric score imputation method to handle right censoring; consistency and asymptotic normality of the resulting estimators are derived and small-sample operating characteristics are evaluated via simulation. Although the proposed semi-parametric transformation model and non-parametric score imputation method are motivated by the analysis of semi-competing risks data, they are broadly applicable to any analysis of multivariate time-to-event outcomes in which a unit-specific shared frailty is used to account for correlation. Finally, the proposed model and estimation procedures are applied to a study of hospital readmission among patients diagnosed with pancreatic cancer.
Journal Article
Functional Partial Linear Single-index Model
by Chen, Min; Feng, Xiang-Nan; Wang, Guochang
in functional data analysis; functional dimension reduction; functional semi‐parametric model; single‐index model
2016
This paper deals with the problem of predicting a real-valued response variable using explanatory variables containing both a multivariate random variable and a random curve. The proposed functional partial linear single-index model treats the multivariate random variable as the linear part and the random curve as the functional single-index part. To estimate the non-parametric link function, the functional single-index and the parameters in the linear part, a two-stage estimation procedure is proposed. Compared with existing semi-parametric methods, the proposed approach requires no initial estimation and iteration. Asymptotic properties are established for both the parameters in the linear part and the functional single-index. The convergence rate for the non-parametric link function is also given. In addition, asymptotic normality of the error variance is obtained, which facilitates the construction of confidence regions and hypothesis testing for the unknown parameter. Numerical experiments including simulation studies and a real-data analysis are conducted to evaluate the empirical performance of the proposed method.
Journal Article
The Partial Linear Model in High Dimensions
by Müller, Patric; van de Geer, Sara
in doubly penalized Lasso; high-dimensional partial linear model; Lasso
2015
Partial linear models have been widely used as a flexible method for modelling linear components in conjunction with non-parametric ones. Despite the presence of the non-parametric part, the linear, parametric part can under certain conditions be estimated at the parametric rate. In this paper, we consider a high-dimensional linear part. We show that it can be estimated with oracle rates, using the least absolute shrinkage and selection operator penalty for the linear part and a smoothness penalty for the non-parametric part.
Journal Article
Analysis of Multiple Diverse Phenotypes via Semiparametric Canonical Correlation Analysis
2017
Studying multiple outcomes simultaneously allows researchers to begin to identify underlying factors that affect all of a set of diseases (i.e., shared etiology) and what may give rise to differences in disorders between patients (i.e., disease subtypes). In this work, our goal is to build risk scores that are predictive of multiple phenotypes simultaneously and identify subpopulations at high risk of multiple phenotypes. Such analyses could yield insight into etiology or point to treatment and prevention strategies. The standard canonical correlation analysis (CCA) can be used to relate multiple continuous outcomes to multiple predictors. However, in order to capture the full complexity of a disorder, phenotypes may include a diverse range of data types, including binary, continuous, ordinal, and censored variables. When phenotypes are diverse in this way, standard CCA is not possible and no methods currently exist to model them jointly. In the presence of such complications, we propose a semi-parametric CCA method to develop risk scores that are predictive of multiple phenotypes. To guard against potential model mis-specification, we also propose a nonparametric calibration method to identify subgroups that are at high risk of multiple disorders. A resampling procedure is also developed to account for the variability in these estimates. Our method opens the door to synthesizing a wide array of data sources for the purposes of joint prediction.
Journal Article
STATISTICAL INFERENCE FOR THE MEAN OUTCOME UNDER A POSSIBLY NON-UNIQUE OPTIMAL TREATMENT STRATEGY
2016
We consider challenges that arise in the estimation of the mean outcome under an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular and asymptotically linear (RAL) estimator of the optimal value. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-n rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We also outline an extension of our approach to a multiple time point problem. All of our results are supported by simulations.
Journal Article
Non-standard rates of convergence of criterion-function-based set estimators for binary response models
2015
This paper establishes consistency and non-standard rates of convergence for set estimators based on contour sets of criterion functions for a semi-parametric binary response model under a conditional median restriction. The model can be partially identified due to potentially limited-support regressors and an unknown distribution of errors. A set estimator analogous to the maximum score estimator is essentially cube-root consistent for the identified set when a continuous but possibly bounded regressor is present. Arbitrarily fast convergence occurs when all regressors are discrete. We also establish the validity of a subsampling procedure for constructing confidence sets for the identified set. As a technical contribution, we provide more convenient sufficient conditions on the underlying empirical processes for cube-root convergence and a sufficient condition for arbitrarily fast convergence, both of which can be applied to other models. Finally, we carry out a series of Monte Carlo experiments, which verify our theoretical findings and shed light on the finite-sample performance of the proposed procedures.
Journal Article
Meta-analysis models relaxing the random-effects normality assumption: methodological systematic review and simulation study
by Panagiotopoulou, Kanella; Metelli, Silvia; Schmid, Christopher H
in Binomial distribution; Datasets; Evidence synthesis
2025
Background
Random-effects meta-analysis is widely used for synthesizing the studies of a systematic review assuming a normal distribution for the study-specific effects. However, this assumption might not always be plausible. Alternative options have been suggested but not used in published meta-analyses.
Methods
We conducted a systematic review to identify articles that proposed alternative meta-analysis models assuming non-normal distributions for the random effects, such as skewed or semi-parametric distributions. Subsequently, we performed a simulation study to evaluate the performance of the identified models and to compare them with the normal model. We considered 22 scenarios varying the amount of random-effects variance, the number of included studies, and the shape of the true distribution: normal, skew-normal, and mixture of two normal distributions. For each scenario, we generated 1000 meta-analysis datasets. To investigate additional aspects of the alternative models, we also applied them to three extracted simulated datasets representing three scenarios with different true distributions.
Results
We identified in total 27 articles suggesting 24 alternative models that can be classified into three broad categories: models based on long-tail and skewed distributions, on mixtures of distributions, and on Dirichlet process priors (DP). We compared 15 models in our simulation study implemented in the Frequentist or Bayesian framework. Results revealed small differences in bias between the different models but larger differences in the level of coverage probability. Scenarios with large random-effects variance lead to more inaccurate estimates of the mean of the random-effects distribution. However, mixture and semi-parametric models revealed latent underlying clustering of studies and helped to form subgroups with common characteristics. The three simulated datasets demonstrated similar patterns to the simulation study for the bias of the mean of the random-effects distribution.
Conclusion
Focusing only on the mean of the random-effects distribution in meta-analysis can be misleading when substantial heterogeneity is suspected or outliers are present. In such cases, identifying the factors that differentiate the studies and looking at the prediction intervals can be very informative. Based on our simulation, investigators could have the normal model as their starting point and consider alternative models as sensitivity analysis in view of seemingly non-normal data.
Journal Article
Decision tree boosted varying coefficient models
2022
Varying coefficient models are a flexible extension of generic parametric models whose coefficients are functions of a set of effect-modifying covariates instead of fitted constants. They are capable of achieving higher model complexity while preserving the structure of the underlying parametric models, hence generating interpretable predictions. In this paper we study the use of gradient boosted decision trees as those coefficient-deciding functions in varying coefficient models with linearly structured outputs. In contrast to the traditional choices of splines or kernel smoothers, boosted trees are more flexible since they require no structural assumptions in the effect modifier space. We introduce our proposed method from the perspective of a localized version of gradient descent, prove its theoretical consistency under mild assumptions commonly adopted by decision tree research, and empirically demonstrate that the proposed tree boosted varying coefficient models achieve high performance qualified by their training speed, prediction accuracy and intelligibility as compared to several benchmark algorithms.
Journal Article
A Joint Modeling Approach for Multivariate Survival Data with Random Length
by Manatunga, Amita K.; Peng, Limin; Marcus, Michele
in Algorithms; Approximate EM algorithm; BIOMETRIC PRACTICE
2017
In many biomedical studies that involve correlated data, an outcome is often repeatedly measured for each individual subject along with the number of these measurements, which is also treated as an observed outcome. This type of data has been referred to as multivariate random length data by Barnhart and Sampson (1995). A common approach to handling this type of data is to jointly model the multiple measurements and the random length. In previous literature, a key assumption is the multivariate normality for the multiple measurements. Motivated by a reproductive study, we propose a new copula-based joint model which relaxes the normality assumption. Specifically, we adopt the Clayton-Oakes model for multiple measurements with flexible marginal distributions specified as semi-parametric transformation models. The random length is modeled via a generalized linear model. We develop an approximate EM algorithm to derive parameter estimators, and standard errors of the estimators are obtained through bootstrapping procedures. The finite-sample performance of the proposed method is investigated using simulation studies. We apply our method to the Mount Sinai Study of Women Office Workers (MSSWOW), where women were prospectively followed for 1 year for studying fertility.
Journal Article