Catalogue Search | MBRL
Explore the vast range of titles available.
188 result(s) for "simple random sampling"
Evaluating Sampling Methods for Content Analysis of Twitter Data
2018
Despite existing evaluations of sampling options for periodical media content, few empirical studies have examined whether probability sampling methods other than simple random sampling are applicable to social media content. This article tests the efficiency of simple random sampling and constructed week sampling by varying the sample size of Twitter content related to the 2014 South Carolina gubernatorial election. We examine how many weeks were needed to adequately represent 5 months of tweets. Our findings show that simple random sampling is more efficient than constructed week sampling at obtaining a representative sample of Twitter data. This study also suggests that a sufficient sample size is necessary when analyzing social media content.
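The two designs compared in this abstract can be illustrated with a short sketch. All data here are synthetic (a hypothetical 150-day series of daily tweet volumes with a weekend bump); the sketch only shows the mechanics of drawing a simple random sample of days versus a constructed week (one randomly chosen Monday, one Tuesday, and so on), not the article's actual data or results.

```python
import random
import statistics

random.seed(1)

# Hypothetical daily tweet volumes over 150 days (~5 months); day 0 is a Monday.
days = list(range(150))
volume = [100 + (40 if d % 7 in (5, 6) else 0) + random.gauss(0, 10) for d in days]

def simple_random_sample(n):
    """Draw n distinct days uniformly at random."""
    return random.sample(days, n)

def constructed_week_sample():
    """Pick one random Monday, one random Tuesday, ..., forming a composite week."""
    return [random.choice([d for d in days if d % 7 == w]) for w in range(7)]

true_mean = statistics.mean(volume)
srs_est = statistics.mean(volume[d] for d in simple_random_sample(7))
cws_est = statistics.mean(volume[d] for d in constructed_week_sample())
print(round(true_mean, 1), round(srs_est, 1), round(cws_est, 1))
```

Repeating the draw many times and comparing the spread of the two estimators around the true mean is the kind of efficiency comparison the study performs on real Twitter data.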
Journal Article
Variable surrogate model-based particle swarm optimization for high-dimensional expensive problems
2023
Many industrial applications require time-consuming and resource-intensive evaluations of suitable solutions within very limited time frames. Therefore, many surrogate-assisted evaluation algorithms (SAEAs) have been widely used to optimize expensive problems. However, due to the curse of dimensionality and its implications, scaling SAEAs to high-dimensional expensive problems is still challenging. This paper proposes a variable surrogate model-based particle swarm optimization (called VSMPSO) to meet this challenge and extends it to solve 200-dimensional problems. Specifically, a single surrogate model constructed by simple random sampling is taken to explore different promising areas in different iterations. Moreover, a variable model management strategy is used to better utilize the current global model and accelerate the convergence rate of the optimizer. In addition, the strategy can be applied to any SAEA irrespective of the surrogate model used. To control the trade-off between optimization results and optimization time consumption of SAEAs, we consider fitness value and running time as a bi-objective problem. Applying the proposed approach to a benchmark test suite of dimensions ranging from 30 to 200 and comparisons with four state-of-the-art algorithms show that the proposed VSMPSO achieves high-quality solutions and computational efficiency for high-dimensional problems.
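The core idea of a surrogate built from simple random sampling, as described in this abstract, can be sketched minimally. Everything below is a hypothetical stand-in: a cheap sphere function plays the "expensive" objective, and a 1-nearest-neighbour lookup plays the surrogate model (VSMPSO itself is not reproduced here).

```python
import random

random.seed(42)

def expensive_f(x):
    """Stand-in for a costly objective (e.g. a long simulation): sphere function."""
    return sum(xi * xi for xi in x)

dim, n_train = 10, 60
# Surrogate training set built by simple random sampling of the search space.
train_x = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_train)]
train_y = [expensive_f(x) for x in train_x]

def surrogate(x):
    """1-nearest-neighbour surrogate: predict the value of the closest sampled point."""
    best = min(range(n_train),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, train_x[i])))
    return train_y[best]

# Screen many cheap candidates with the surrogate; evaluate only the best exactly.
candidates = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(500)]
best_cand = min(candidates, key=surrogate)
print(round(expensive_f(best_cand), 2))
```

The saving is that the expensive function is called only n_train + 1 times while 500 candidates are screened, which is the trade-off surrogate-assisted evolutionary algorithms exploit at scale.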
Journal Article
On Estimating Multi-Stress Strength Reliability for Inverted Kumaraswamy Under Ranked Set Sampling with Application in Engineering
by Nagy, Heba F.; Alsadat, Najwan; Ahmad, Hijaz in Estimators, Extreme environments, Mathematical Physics
2024
The terrible operating constraints of many real-world events cause systems to malfunction regularly. The failure of systems to perform their intended duties when they reach their lowest, highest, or both extreme operating conditions is a phenomenon that researchers rarely focus on. The multi-stress strength reliability R = P(W < X < Z) is considered in this study for a component whose strength X falls between two stresses, W and Z, where X, W, and Z are independently inverted Kumaraswamy distributed. Both maximum likelihood and maximum product spacing procedures are employed to obtain the reliability estimator under simple random sampling (SRS) and ranked set sampling (RSS) methodologies. Four scenarios for reliability estimators are considered. The reliability estimator in the first and second cases can be determined by applying the same sample design (RSS/SRS) to the strength and stress distributions. The third reliability estimator is calculated when the sample data for W and Z originate from RSS while those for X are acquired from SRS. In the final scenario, the strength and stress random variables are drawn from SRS and RSS, respectively. The effectiveness of the suggested estimators is compared using a comprehensive computer simulation. Lastly, three real data sets are used to determine reliability estimators.
Journal Article
The Truncated Cauchy Power Family of Distributions with Inference and Applications
by Elbatal, Ibrahim; Aldahlan, Maha A.; Chesneau, Christophe in cauchy distribution, data analysis, entropy
2020
The statistical literature lacks a general family of distributions based on the truncated Cauchy distribution. In this paper, such a family is proposed, called the truncated Cauchy power-G family. It stands out for the originality of the involved functions, its overall simplicity, and its desirable properties for modelling purposes. In particular, (i) only one parameter is added to the baseline distribution, avoiding the over-parametrization phenomenon; (ii) the related probability functions (cumulative distribution, probability density, hazard rate, and quantile functions) have tractable expressions; and (iii) thanks to the combined action of the arctangent and power functions, the flexible properties of the baseline distribution (symmetry, skewness, kurtosis, etc.) can be substantially enhanced. These aspects are discussed in detail, with the support of comprehensive numerical and graphical results. Furthermore, important mathematical features of the new family are derived, such as the moments, skewness and kurtosis, two kinds of entropy, and order statistics. On the applied side, new models can be created to fit data sets with simple or complex structure. This last point is illustrated by taking the Weibull distribution as baseline, using the maximum likelihood method of estimation, and considering two practical data sets with different skewness properties. The obtained results show that the truncated Cauchy power-G family is very competitive in comparison to other well-established general families.
Journal Article
Sample size requirements for stated choice experiments
2013
Stated choice (SC) experiments represent the dominant data paradigm in the study of behavioral responses of individuals, households, and other organizations, yet in the past little has been known about the sample size requirements for models estimated from such data. Traditional orthogonal designs and existing sampling theories do not adequately address the issue, and hence researchers have had to resort to simple rules of thumb, ignore the issue and collect samples of arbitrary size in the hope that the sample is sufficiently large to produce reliable parameter estimates, or make assumptions about the data that are unlikely to hold in practice. In this paper, we demonstrate how a recently proposed sample size computation can be used to generate so-called S-efficient designs using prior parameter values to estimate panel mixed multinomial logit models. Sample size requirements for such designs in SC studies are investigated. A numerical case study shows that a D-efficient, and even more so an S-efficient, design requires a (much) smaller sample size than a random orthogonal design in order to estimate all parameters at the level of statistical significance. Furthermore, it is shown that a wide level range has a significant positive influence on the efficiency of the design and therefore on the reliability of the parameter estimates.
Journal Article
ln-Type Variance Estimators in Simple Random Sampling
2020
Until now, various types of estimators have been used for estimating the population variance in simple random sampling studies, including ratio, product, regression, and exponential-type estimators. In this article, we propose a family of ln-type estimators for the first time in simple random sampling and show that they are more efficient than the other types of estimators under certain conditions obtained theoretically. Numerical illustrations and a simulation study support our theoretical findings. In addition, we show how to determine the optimal points at which the ln-type estimators attain their minimum MSE values in different data sets.
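The article's ln-type family is not reproduced here, but the baseline it competes against can be sketched: the classical ratio-type variance estimator under simple random sampling, which rescales the sample variance of the study variable by how far the auxiliary variable's sample variance drifted from its known population value. All data below are synthetic and hypothetical.

```python
import random
import statistics

random.seed(3)

# Hypothetical finite population: auxiliary variable x, correlated study variable y.
N = 5_000
x = [random.gauss(50, 10) for _ in range(N)]
y = [0.8 * xi + random.gauss(0, 4) for xi in x]
Sx2 = statistics.pvariance(x)  # population variance of x, assumed known
Sy2 = statistics.pvariance(y)  # estimation target: population variance of y

idx = random.sample(range(N), 200)  # simple random sample, n = 200
sx2 = statistics.variance([x[i] for i in idx])
sy2 = statistics.variance([y[i] for i in idx])

naive_est = sy2                # usual sample variance of y
ratio_est = sy2 * (Sx2 / sx2)  # classical ratio-type variance estimator
print(round(Sy2, 1), round(naive_est, 1), round(ratio_est, 1))
```

When x and y are positively correlated, an undersized sx2 usually coincides with an undersized sy2, so the rescaling pulls the estimate back toward the target; the ln-type estimators of the article refine this correction further.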
Journal Article
Quantile regression-ratio-type estimators for mean estimation under complete and partial auxiliary information
2022
Traditional Ordinary Least Squares (OLS) regression is commonly utilized to develop regression-ratio-type estimators through traditional measures of location. Abid et al. [Abid, M., Abbas, N., Zafar Nazir, H., et al. "Enhancing the mean ratio estimators for estimating population mean using non-conventional location parameters", Revista Colombiana de Estadística, 39(1), pp. 63-79 (2016b)] extended this idea and developed regression-ratio-type estimators based on traditional and non-traditional measures of location. In this article, quantile regression with traditional and non-traditional measures of location is utilized and a class of ratio-type mean estimators is proposed. The theoretical Mean Square Error (MSE) expressions are also derived. The work is further extended to two-phase sampling (partial information). The relationship between the proposed and existing groups of estimators is shown by considering real data collections originating from different sources. The findings are encouraging, and the superior performance of the proposed group of estimators is witnessed and documented throughout the article.
Journal Article
Variance Estimation under Some Transformation for Both Symmetric and Asymmetric Data
2024
This article suggests an improved class of efficient estimators that use various transformations to estimate the finite population variance of the study variable. These estimators are particularly helpful when the minimum and maximum values of the auxiliary variable are known and the ranks of the auxiliary variable are associated with the study variable; these ranks can then serve as an effective tool to improve the accuracy of the estimator. A first-order approximation is used to investigate the properties of the proposed class of estimators, such as bias and mean squared error (MSE), under simple random sampling. A simulation study is carried out to measure performance and verify the theoretical results. According to the results, the suggested class of estimators has a greater percent relative efficiency (PRE) than the other existing estimators in all of the simulated situations. Three symmetric and asymmetric datasets are examined in the application section to show the superior performance of the proposed class of estimators over the existing estimators.
Journal Article
Two-stage sampling for better survival model performance
2025
Background
With the emergence of high-dimensional censored survival data in health and medicine, the use of survival models for risk prediction is increasing. To date, practical techniques exist for splitting data for model training and performance evaluation. While different sampling methods have been compared for their performance, the effects of the data splitting ratio and survival-specific characteristics have not yet been examined for high-dimensional censored survival data.
Methods
We first conduct an empirical study of the simple random sampling technique and the stratified sampling technique on real high-dimensional gene expression datasets, assessing Lasso Cox model performance. For simple random sampling, various data splitting ratios are investigated. For stratified sampling, different survival-specific variables are investigated. We consider the C-index and Brier Score as evaluation metrics. We further develop and validate a two-stage purposive sampling approach motivated by our empirical findings.
Results
Our findings reveal that survival-specific characteristics contribute to model performance across training, testing, and validation data. The proposed two-stage purposive sampling approach performs well in mitigating excessive diversity within the training data in both the simulation study and the real data analysis, leading to better survival model performance.
Conclusions
We recommend careful consideration of key factors in different sampling techniques when developing and validating survival models. Approaches such as the proposed two-stage purposive sampling, which mitigates excessive diversity, provide a practical solution.
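The simplest form of stratified splitting this study compares against simple random sampling can be sketched as follows. The data below are synthetic (hypothetical exponential survival times with roughly 60% observed events), and the only survival-specific characteristic stratified on is event vs. censored status; the article's two-stage purposive approach goes further than this.

```python
import random

random.seed(0)

# Hypothetical censored survival records: (time, event); event=True means observed.
records = [(random.expovariate(0.1), random.random() < 0.6) for _ in range(500)]

def stratified_split(data, test_frac=0.3):
    """Split into train/test while preserving the event vs. censored mix."""
    train, test = [], []
    for keep_event in (True, False):
        stratum = [r for r in data if r[1] == keep_event]
        shuffled = random.sample(stratum, len(stratum))
        cut = int(len(stratum) * test_frac)
        test.extend(shuffled[:cut])
        train.extend(shuffled[cut:])
    return train, test

def event_rate(rows):
    return sum(r[1] for r in rows) / len(rows)

train, test = stratified_split(records)
print(len(train), len(test), round(event_rate(train), 2), round(event_rate(test), 2))
```

A plain random split can leave train and test with noticeably different censoring rates, which distorts C-index and Brier Score comparisons; stratifying on the event indicator keeps the two rates aligned up to rounding.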
Journal Article