Catalogue Search | MBRL
1,933 result(s) for "Small samples"
Exact inference on the random-effects model for meta-analyses with few studies
2019
We describe an exact, unconditional, non-randomized procedure for producing confidence intervals for the grand mean in a normal-normal random effects meta-analysis. The procedure targets meta-analyses based on too few primary studies, ≤ 7, say, to allow for the conventional asymptotic estimators, e.g., DerSimonian and Laird (1986), or non-parametric resampling-based procedures, e.g., Liu et al. (2017). Meta-analyses with so few studies are common, with one recent sample of 22,453 health-related meta-analyses finding a median of 3 primary studies per meta-analysis (Davey et al., 2011). Reliable and efficient inference procedures are therefore needed to address this setting. The coverage level of the resulting CI is guaranteed to be above the nominal level, up to Monte Carlo error, provided the meta-analysis contains more than 1 study and the model assumptions are met. After employing several techniques to accelerate computation, the new CI can be easily constructed on a personal computer. Simulations suggest that the proposed CI typically is not overly conservative. We illustrate the approach on several contrasting examples of meta-analyses investigating the effect of calcium intake on bone mineral density.
Journal Article
Rockburst classification based on cross reconstruction learning under small-sample condition
2024
Rockburst is a prevalent geological hazard in deep geotechnical engineering, and its accurate prognostication is vital for prevention measures. Consequently, this research proffers a pioneering classification prediction methodology, namely Cross Reconstruction Learning (CR), underpinned by conventional machine learning algorithms and metric learning strategies. Initially, this technique partitions and restructures the original dataset, where each sample feature intersects and reconfigures with features from other samples within the set. During this amalgamation, new samples are assigned labels based on the degree of divergence or congruity between two sets of sample labels, thereby forming a new set of samples. Subsequently, an array of machine learning algorithms is utilized to train and test this new dataset. Ultimately, employing a universal class voting mechanism and decoding test set results through probability assignment, the predicted labels are converted back into rock burst outcomes, thereby determining the final prediction classification. The proposed model was trained on a database encompassing 239 instance samples, and its performance was validated against the currently proficient models (KNN, XGBoost, and Random Forest algorithms) employed in rock burst prediction. The outcome revealed a decline in the performance metrics of all three machine learning algorithms when interfaced with the Cross Reconstruction learning method, particularly the KNN algorithm, owing to the doubled feature dimensions in the combined dataset. However, the metrics of ensemble models, XGBoost and Random Forest, exhibited a notable improvement compared to the original classification models. On comparing multiple performance metrics, it was discovered that the CR-XGBoost model outperformed others across all evaluations, thereby offering significant guidance for practical engineering applications.
Journal Article
Rolling Bearing Fault Detection Based on Self-Adaptive Wasserstein Dual Generative Adversarial Networks and Feature Fusion under Small Sample Conditions
2025
An intelligent diagnosis method based on self-adaptive Wasserstein dual generative adversarial networks and feature fusion is proposed to address problems such as insufficient sample size and incomplete fault feature extraction, which are commonly faced in rolling bearing diagnosis and lead to low diagnostic accuracy. Initially, dual models of the Wasserstein deep convolutional generative adversarial network incorporating gradient penalty (1D-2DWDCGAN) are constructed to augment the original dataset. A self-adaptive loss threshold control training strategy is introduced, establishing a self-adaptive balancing mechanism for stable model training. Subsequently, a diagnostic model based on multidimensional feature fusion is designed, wherein complex features from various dimensions are extracted, merging the original signal waveform features, structured features, and time-frequency features into a deep composite feature representation that encompasses multiple dimensions and scales; thus, efficient and accurate small sample fault diagnosis is facilitated. Finally, experiments on the bearing fault dataset of Case Western Reserve University and on the fault simulation experimental platform dataset of this research group show that this method effectively supplements the dataset and remarkably improves the diagnostic accuracy. The diagnostic accuracy after data augmentation reached 99.94% and 99.87% in the two experimental environments, respectively. In addition, robustness analysis is conducted on the diagnostic accuracy of the proposed method under different noise backgrounds, verifying its good generalization performance.
Journal Article
Adult survival of Arctic terns in the Canadian High Arctic
by Davis, Shanti E.; Fife, Danielle T.; Robertson, Gregory J.
in AICc: Akaike Information Criterion, corrected for small sample size; Animal behavior; Aquatic birds
2018
Arctic tern (Sterna paradisaea) populations are thought to be in decline across much of their range. For long-lived seabirds, determining adult survival rates is key to understanding current population trends and predicting trajectories. We therefore examined adult survival of terns banded at our field site in the Canadian High Arctic between 2007 and 2016. Apparent adult survival was 0.883, comparable to values for other tern species and for other Arctic larids. However, using this survival rate plus first year survival values from a recent study in Iceland, we project a declining trend for terns in the Canadian High Arctic, consistent with recent reports from local ecological knowledge and limited regional surveys. Our data suggest that low adult survival is not responsible for declining tern populations, and that studies should investigate whether dispersal to new nesting locations may be underway, or that young terns are not surviving well or recruiting to the population.
Journal Article
Exact and Approximate Statistical Inference for Nonlinear Regression and the Estimating Equation Approach
2017
The exact density distribution of the non-linear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the non-linear regression with an arbitrary number of linear parameters and one intrinsically non-linear parameter. For a very special non-linear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller almost a century ago, unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the non-linear least squares are illustrated, such as non-existence and/or multiple solutions, as major factors contributing to poor density approximation. The non-linear Markov–Gauss theorem is formulated on the basis of the near exact EE density approximation.
Journal Article
Small-Sample Methods for Cluster-Robust Variance Estimation and Hypothesis Testing in Fixed Effects Models
2018
In panel data models and other regressions with unobserved effects, fixed effects estimation is often paired with cluster-robust variance estimation (CRVE) to account for heteroscedasticity and un-modeled dependence among the errors. Although asymptotically consistent, CRVE can be biased downward when the number of clusters is small, leading to hypothesis tests with rejection rates that are too high. More accurate tests can be constructed using bias-reduced linearization (BRL), which corrects the CRVE based on a working model, in conjunction with a Satterthwaite approximation for t-tests. We propose a generalization of BRL that can be applied in models with arbitrary sets of fixed effects, where the original BRL method is undefined, and describe how to apply the method when the regression is estimated after absorbing the fixed effects. We also propose a small-sample test for multiple-parameter hypotheses, which generalizes the Satterthwaite approximation for t-tests. In simulations covering a wide range of scenarios, we find that the conventional cluster-robust Wald test can severely over-reject while the proposed small-sample test maintains Type I error close to nominal levels. The proposed methods are implemented in an R package called clubSandwich. This article has online supplementary materials.
Journal Article
SURPRISED BY THE HOT HAND FALLACY? A TRUTH IN THE LAW OF SMALL NUMBERS
2018
We prove that a subtle but substantial bias exists in a common measure of the conditional dependence of present outcomes on streaks of past outcomes in sequential data. The magnitude of this streak selection bias generally decreases as the sequence gets longer, but increases in streak length, and remains substantial for a range of sequence lengths often used in empirical work. We observe that the canonical study in the influential hot hand fallacy literature, along with replications, are vulnerable to the bias. Upon correcting for the bias, we find that the longstanding conclusions of the canonical study are reversed.
Journal Article
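The streak selection bias described in the abstract above can be illustrated with a small exact enumeration. This is only a sketch under stated assumptions: it takes the "common measure" to be the per-sequence proportion of successes that immediately follow k consecutive successes, averaged over fair-coin sequences that contain at least one such opportunity. The function name and parameters are hypothetical and are not taken from the article.

```python
from fractions import Fraction
from itertools import product


def expected_proportion_after_streak(n, k=1):
    """Exact average, over all fair-coin sequences of length n, of the
    within-sequence proportion of heads that immediately follow k heads
    in a row. Sequences with no such opportunity are excluded.
    (Hypothetical helper for illustration only.)"""
    total = Fraction(0)
    counted = 0
    for seq in product((0, 1), repeat=n):  # 1 = head, 0 = tail
        # Outcomes of flips that come right after a run of k heads.
        follow = [seq[i] for i in range(k, n) if all(seq[i - k:i])]
        if follow:
            total += Fraction(sum(follow), len(follow))
            counted += 1
    return total / counted


if __name__ == "__main__":
    # For three flips the average proportion is 5/12, below the true 1/2.
    print(expected_proportion_after_streak(3))
```

Although every flip is a head with probability 1/2, the averaged proportion falls below 1/2 for short sequences, which is the direction of bias the abstract reports.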
The Effect of Small Sample Size on Two-Level Model Estimates: A Review and Illustration
by McNeish, Daniel M.; Stapleton, Laura M.
in Child and School Psychology; Cluster Grouping; Data analysis
2016
Multilevel models are an increasingly popular method to analyze data that originate from a clustered or hierarchical structure. To effectively utilize multilevel models, one must have an adequately large number of clusters; otherwise, some model parameters will be estimated with bias. The goals for this paper are to (1) raise awareness of the problems associated with a small number of clusters, (2) review previous studies on multilevel models with a small number of clusters, (3) provide an illustrative simulation to demonstrate how a simple model becomes adversely affected by small numbers of clusters, (4) provide researchers with remedies if they encounter clustered data with a small number of clusters, and (5) outline methodological topics that have yet to be addressed in the literature.
Journal Article
A screening for canine distemper virus, canine adenovirus and carnivore protoparvoviruses in Arctic foxes (Vulpes lagopus) and red foxes (Vulpes vulpes) from Arctic and sub-Arctic regions of Norway
by Tryland, Morten; Yoccoz, Nigel Gilles; Mørk, Torill
in adenovirus; Adenoviruses; AICc: Akaike's Information Criterion corrected for small sample size
2018
Canine distemper virus (CDV), canine adenovirus (CAdV) and canine parvovirus type 2 (CPV-2) cause disease in dogs (Canis familiaris). These, or closely related viruses, may also infect wild carnivores. The aim of this study was to investigate exposure to CDV, CAdV and CPV-2 among fox populations in Norway. Arctic foxes (n = 178) from High-Arctic Svalbard were investigated for antibodies against CDV. Arctic foxes (n = 301) from Svalbard and red foxes from Low-Arctic (n = 326) and sub-Arctic (n = 74) regions in Finnmark County, Norway, were investigated for antibodies against CAdV and for the presence of carnivore protoparvovirus DNA in spleen and mesenteric lymph nodes using polymerase chain reaction. Seroprevalence against CDV in Arctic foxes decreased from 25% (1995/96) to 6% (2001/02), whereas the seroprevalence against CAdV increased from 25–40% during the seasons 1995/96 to 2001/02 to 68% for the last study year (2002/03). In red foxes, the seroprevalence against CAdV varied between 31% and 67% for the seasons 2004/05 to 2007/08, increasing to 80% for the last study year. Carnivore protoparvovirus DNA was not detected in any of the 301 Arctic foxes and the 265 red foxes investigated. These results show that CDV and CAdV are enzootic in the Arctic fox population (Svalbard), and that CAdV is enzootic in both the Low-Arctic and sub-Arctic red fox populations (Finnmark). Further studies are needed to better understand the infection biology and the impact of CDV and CAdV in these fox populations, and if viruses may be shared between foxes and other carnivores, including dogs.
Journal Article
Oversampling and replacement strategies in propensity score matching: a critical review focused on small sample size in clinical settings
by Gregori, Dario; Bejko, Jonida; Carrozzini, Massimiliano
in Bias; Clinical trials; Health Sciences
2021
Background
Propensity score matching is a statistical method that is often used to make inferences on the treatment effects in observational studies. In recent years, there has been widespread use of the technique in the cardiothoracic surgery literature to evaluate the potential benefits of new surgical therapies or procedures. However, the small sample size and the strong dependence of the treatment assignment on the baseline covariates that often characterize these studies make such an evaluation challenging from a statistical point of view. In such settings, the use of propensity score matching in combination with oversampling and replacement may provide a solution to these issues by increasing the initial sample size of the study and thus improving the statistical power that is needed to detect the effect of interest. In this study, we review the use of propensity score matching in combination with oversampling and replacement in small sample size settings.
Methods
We performed a series of Monte Carlo simulations to evaluate how the sample size, the proportion of treated, and the assignment mechanism affect the performances of the proposed approaches. We assessed the performances with overall balance, relative bias, root mean squared error and nominal coverage. Moreover, we illustrate the methods using a real case study from the cardiac surgery literature.
Results
Matching without replacement produced estimates with lower bias and better nominal coverage than matching with replacement when 1:1 matching was considered. In contrast to that, matching with replacement showed better balance, relative bias, and root mean squared error than matching without replacement for increasing levels of oversampling. The best nominal coverage was obtained by using the estimator that accounts for uncertainty in the matching procedure on sets of units obtained after matching with replacement.
Conclusions
Matching with replacement provides the most reliable treatment effect estimates, and no more than 1 or 2 units from the control group should be matched to each treated observation. Moreover, the variance estimator that accounts for the uncertainty in the matching procedure should be used to estimate the treatment effect.
Journal Article