Catalogue Search | MBRL
Explore the vast range of titles available.
255 result(s) for "stratified random sampling"
What are the most crucial soil variables for predicting the distribution of mountain plant species? A comprehensive study in the Swiss Alps
by Grand, Stéphanie; Spangenberg, Jorge E.; Pinto-Figueroa, Eric
in alpine plants, biogeography, calcium oxide
2020
Aim: To investigate the potential of a large range of soil variables to improve topo‐climatic models of plant species distributions in a temperate mountain region encompassing complex relief. Location: The western Swiss Alps. Methods: Fitting topo‐climatic models for >60 plant species across >250 sites with and without added soil predictor variables (>30). Testing included the following: (a) which soil variables improve plant species distribution models; (b) whether an optimal subset of soil variables can improve models for the majority of species and habitat types and (c) how much variation in plant species distributions soil variables alone explain. Results: Geochemical variables (i.e. CaO, pH and inorganic carbon) and a drainage indicator (i.e. bulk soil water content) improved the predictive abilities of the models across the large majority of alpine plant species. The improvement of the models after the addition of soil information varied strongly between plant species and habitat types, but a trade‐off was found between the number of soil variables and the associated gain in model performance. Finally, across all species, one specific combination of soil variables – bulk soil water content + total phosphorus + δ13C – outperformed the commonly used topo‐climatic variables. Main conclusions: Several soil variables significantly increased the predictive power of plant species distribution models in the temperate mountain region. Geochemical and drainage variables proved most important.
Journal Article
Assessing the Accuracy and Consistency of Six Fine-Resolution Global Land Cover Products Using a Novel Stratified Random Sampling Validation Dataset
by Zhang, Xiao; Gao, Yuan; Liu, Liangyun
in Accuracy, accuracy assessment, Artificial satellites in remote sensing
2023
Over the past decades, benefiting from the development of computing capacity and the free access to Landsat and Sentinel imagery, several fine-resolution global land cover (GLC) products (with a resolution of 10 m or 30 m) have been developed (GlobeLand30, FROM-GLC30, GLC_FCS30, FROM-GLC10, European Space Agency (ESA) WorldCover and ESRI Land Cover). However, there is still a lack of consistency analysis or comprehensive accuracy assessment using a common validation dataset for these GLC products. In this study, a novel stratified random sampling GLC validation dataset (SRS_Val) containing 79,112 validation samples was developed using a visual interpretation method, significantly increasing the number of samples of heterogeneous regions and rare land-cover types. Then, we quantitatively assessed the accuracy of these six GLC products using the developed SRS_Val dataset at global and regional scales. The results reveal that ESA WorldCover achieved the highest overall accuracy (of 70.54% ± 9%) among the global 10 m land cover products, followed by FROM-GLC10 (68.95% ± 8%) and ESRI Land Cover (58.90% ± 7%) and that GLC_FCS30 had the best overall accuracy (of 72.55% ± 9%) among the global 30 m land cover datasets, followed by GlobeLand30 (69.96% ± 9%) and FROM-GLC30 (66.30% ± 8%). The mapping accuracy of the GLC products decreased significantly with the increased heterogeneity of landscapes, and all GLC products had poor mapping accuracies in countries with heterogeneous landscapes, such as some countries in Central and Southern Africa. Finally, we investigated the consistency of six GLC products from the perspective of area distributions and spatial patterns. It was found that the area consistencies between the five GLC products (except ESRI Land Cover) were greater than 85% and that the six GLC products showed large discrepancies in area consistency for grassland, shrubland, wetlands and bare land. 
In terms of spatial patterns, the totally inconsistent pixel proportions of the 10 m and 30 m GLC products were 23.58% and 14.12%, respectively, and these inconsistent pixels were mainly distributed in transition zones, complex terrain regions, heterogeneous landscapes, or mixed land-cover types. Therefore, the SRS_Val dataset supports the quantitative evaluation of fine-resolution GLC products well, and the assessment results provide users with quantitative metrics to select GLC products suitable for their needs.
Journal Article
Stratified random sampling from streaming and stored data
by Tirthapura, Srikanta; Xu, Bojian; Srivastava, Divesh
in Algorithms, Data transmission, Lower bounds
2021
Stratified random sampling (SRS) is a widely used sampling technique for approximate query processing. We consider SRS on continuously arriving data streams and statically stored data sets. We present a tight lower bound showing that any streaming algorithm for SRS over the entire stream must have, in the worst case, a variance that is an Ω(r) factor away from the optimal, where r is the number of strata. We present S-VOILA, a practical streaming algorithm for SRS over the entire stream that is locally variance-optimal. We prove that any sliding window-based streaming SRS needs a workspace of Ω(rM log W) in the worst case to maintain a variance-optimal SRS of size M, where W is the number of elements in the sliding window. Due to the inherently high workspace needs of sliding window-based SRS, we present SW-VOILA, a multi-layer practical sampling algorithm that uses only O(M) workspace but can maintain an SRS of size close to M in practice over a sliding window. Experiments show that both S-VOILA and SW-VOILA result in a variance that is typically close to that of their optimal offline counterparts, which were given the entire input beforehand. We also present VOILA, a variance-optimal offline algorithm for stratified random sampling. VOILA is a strict generalization of the well-known Neyman allocation, which is optimal only under the assumption that each stratum is abundant. Experiments show that VOILA can have significantly smaller variance (1.4x to 50x) than Neyman allocation on real-world data.
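As a rough illustration of the Neyman allocation that this abstract uses as a baseline, the sketch below splits a total sample size across strata in proportion to stratum size times stratum standard deviation. The function name and inputs are hypothetical. It is not the VOILA algorithm, and it does not cap an allocation at the stratum size, which is the "abundant stratum" assumption the abstract mentions.

```python
def neyman_allocation(stratum_sizes, stratum_stds, total_sample_size):
    """Return real-valued per-stratum sample sizes under Neyman allocation.

    Stratum h receives total_sample_size * (N_h * S_h) / sum_k(N_k * S_k),
    where N_h is the stratum size and S_h its standard deviation.
    Allocations are not rounded and not capped at N_h (assumption: every
    stratum is large enough to supply its allocation).
    """
    if len(stratum_sizes) != len(stratum_stds):
        raise ValueError("sizes and standard deviations must have equal length")
    weights = [n * s for n, s in zip(stratum_sizes, stratum_stds)]
    denominator = sum(weights)
    if denominator == 0:
        raise ValueError("all strata have zero size or zero variability")
    return [total_sample_size * w / denominator for w in weights]


# Example: the second stratum is twice as large and twice as variable,
# so it receives four times the allocation of the first.
allocation = neyman_allocation([100, 200], [1.0, 2.0], 50)
```

In practice the allocations would still need rounding to integers, and a stratum whose allocation exceeds its size would need separate handling.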
Journal Article
Area based stratified random sampling using geospatial technology in a community-based survey
by Howell, Carrie R.; Agne, April A.; Cherrington, Andrea L.
in Area based, Biostatistics, Biostatistics and methods
2020
Background
Most studies among Hispanics have focused on individual risk factors of obesity, with less attention on interpersonal, community and environmental determinants. Conducting community based surveys to study these determinants must ensure representativeness of disparate populations. We describe the use of a novel Geographic Information System (GIS)-based population based sampling to minimize selection bias in a rural community based study.
Methods
We conducted a community based survey to collect and examine social determinants of health and their association with obesity prevalence among a sample of Hispanics and non-Hispanic whites living in a rural community in the Southeastern United States. To ensure a balanced sample of both ethnic groups, we designed an area stratified random sampling procedure involving three stages: (1) division of the sampling area into non-overlapping strata based on Hispanic household proportion using GIS software; (2) random selection of the designated number of Census blocks from each stratum; and (3) random selection of the designated number of housing units (i.e., survey participants) from each Census block.
Results
The proposed sample included 109 Hispanic and 107 non-Hispanic participants to be recruited from 44 Census blocks. The final sample included 106 Hispanic and 111 non-Hispanic participants. The proportion of Hispanic surveys completed per stratum matched our proposed distribution: 7% for stratum 1, 30% for stratum 2, 58% for stratum 3 and 83% for stratum 4.
Conclusion
Utilizing a standardized area based randomized sampling approach allowed us to successfully recruit an ethnically balanced sample while conducting door to door surveys in a rural, community based study. The integration of area based randomized sampling using tools such as GIS in future community-based research should be considered, particularly when trying to reach disparate populations.
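The three-stage procedure described in the Methods of this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name, the nested-dictionary input, and the fixed seed are all assumptions, and stage 1 (assigning Census blocks to strata with GIS software) is taken as already done and supplied as input.

```python
import random


def three_stage_sample(strata, blocks_per_stratum, units_per_block, seed=0):
    """Draw blocks within each stratum, then housing units within each block.

    strata: dict mapping stratum label -> dict mapping block id -> list of
            housing-unit ids (hypothetical structure; stage 1 output).
    Returns a dict with the same nesting, containing only selected items.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    selected = {}
    for stratum, blocks in strata.items():
        # Stage 2: simple random sample of blocks within the stratum.
        n_blocks = min(blocks_per_stratum, len(blocks))
        chosen_blocks = rng.sample(sorted(blocks), n_blocks)
        # Stage 3: simple random sample of housing units within each block.
        selected[stratum] = {
            block: rng.sample(blocks[block], min(units_per_block, len(blocks[block])))
            for block in chosen_blocks
        }
    return selected
```

The abstract refers to a "designated number" of blocks and units per stratum; the sketch uses one number for all strata only to stay short.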
Journal Article
Regional Accuracy Assessment of 30-Meter GLC_FCS30, GlobeLand30, and CLCD Products: A Case Study in Xinjiang Area
2024
With the development of remote sensing technology, a number of fine-resolution (30-m) global/national land cover (LC) products have been developed. However, accuracy assessments for the developed LC products are commonly conducted at global and national scales. Due to the limited availability of representative validation observations and reference data, knowledge relating to the accuracy and applicability of existing LC products on a regional scale is limited. Since Xinjiang, China, exhibits diverse surface cover and fragmented urban landscapes, existing LC products generally have high classification uncertainty in this region. This makes Xinjiang suitable for assessing the accuracy and consistency of existing fine-resolution land cover products. In order to improve knowledge of the accuracy of existing fine-resolution LC products at the regional scale, Xinjiang province was selected as the case area. First, we employed an equal-area stratified random sampling approach with climate, population density, and landscape heterogeneity information as constraints, along with the hexagonal discrete global grid system (HDGGS) as basic sampling grids to develop a high-density land cover validation dataset for Xinjiang (HDLV-XJ) in 2020. This is the first publicly available regionally high-density validation dataset that can support analysis at a regional scale, comprising a total of 20,932 validation samples. Then, based on the generated HDLV-XJ dataset, the accuracies and consistency among three widely used 30-m LC products, GLC_FCS30, GlobeLand30, and CLCD, were quantitatively evaluated. The results indicated that GLC_FCS30 exhibited the highest overall accuracy (88.10%) in Xinjiang, followed by GlobeLand30 (with an overall accuracy of 83.58%) and CLCD (81.57%).
Moreover, through a comprehensive analysis of the relationship between different environmental conditions and land cover product performance, we found that GlobeLand30 performed best in regions with high landscape fragmentation, while GLC_FCS30 stood out as the best-performing product in areas with uneven proportions of land cover types. Our study provides novel insight into the suitability of these three widely used LC products under various environmental conditions. The findings and dataset can inform the application of existing LC products in different environmental conditions by clarifying their accuracies and limitations.
Journal Article
Robust mean estimation in stratified sampling: A quantile regression approach
2026
The presence of outliers can significantly reduce the reliability of conventional estimators, often resulting in biased estimation of the population mean. This challenge is particularly relevant in practical survey data, where irregular observations frequently occur. Although this issue is well-recognized, limited work has been done on population mean estimation under stratified random sampling when outliers are present, especially within a quantile regression framework. To address this gap, the present study introduces a new class of robust estimators based on quantile regression. The proposed estimators make use of non-conventional auxiliary information to improve estimation accuracy under a stratified random sampling scheme. By relying on quantile-based methods, the suggested approach provides greater resistance to the influence of outliers. The statistical properties of the proposed estimators are derived analytically, including measures of bias and mean squared error. In addition, their performance is examined through a real-life dataset. The findings indicate that the proposed estimators offer notable gains in efficiency and robustness compared to the adopted estimators, particularly in datasets affected by outliers.
Journal Article
An efficient logarithmic estimator in stratified random sampling using single auxiliary variable
2026
This paper proposes a novel logarithmic-type estimator for the estimation of the population mean under stratified random sampling when a single auxiliary variable is available. By employing a logarithmic transformation, the suggested estimator more effectively exploits auxiliary information, leading to improved estimation accuracy. Analytical expressions for the bias and precision of the estimator are derived, and theoretical conditions are established under which the proposed estimator dominates several existing estimators. The empirical performance is assessed using three real-life datasets, along with a simulation study based on three artificial populations of size 1000 generated from a normal distribution. Samples of sizes n = 50, 100, and 150 are drawn to evaluate estimator behavior under varying sampling intensities. The findings consistently show that the proposed estimator attains higher percentage relative efficiency and superior precision compared to competing estimators across both real and simulated datasets. These results demonstrate that the estimator offers substantial gains in survey accuracy, particularly when a strong association exists between the study and auxiliary variables.
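Percentage relative efficiency (PRE) is commonly defined as the mean squared error of a reference estimator divided by that of the proposed estimator, times 100. The abstract does not state the exact formula used, so the definition and the function name below are assumptions given only to make the term concrete.

```python
def percentage_relative_efficiency(mse_reference, mse_proposed):
    """Return 100 * MSE(reference) / MSE(proposed).

    Values above 100 mean the proposed estimator has the smaller mean
    squared error under this (assumed) definition.
    """
    if mse_proposed <= 0:
        raise ValueError("mse_proposed must be positive")
    return 100.0 * mse_reference / mse_proposed


# Example: halving the MSE corresponds to a PRE of 200.
pre = percentage_relative_efficiency(4.0, 2.0)
```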
Journal Article
Enhanced log ratio calibration methods for stratified variance estimation in survey sampling
by Alghamdi, Fatimah M.; Minhas, Kanwal Shafiq; Alsheikh, Sara M. A.
in 631/114, 639/705, Calibration
2025
Survey sampling is a widely used technique for collecting data from a subset of a bigger population. Among its methods, stratified random sampling is particularly valuable for yielding precise inferences about distinct subgroups within a population by dividing the population into mutually exclusive strata and sampling from each group. This approach reduces sampling error and enhances the accuracy of population estimates. In this study, we propose a set of improved calibrated log-ratio-type estimators for estimating population variance under a stratified sampling framework. The performance of three proposed estimators is evaluated and compared in terms of the mean squared error. A simulation study is conducted to assess the efficiency of the estimators, complemented by a real-life application to validate the simulation results. The findings demonstrate that the proposed calibrated log-ratio variance estimators outperform existing methods by achieving lower mean squared error.
Journal Article
L-Moments and calibration-based variance estimators under double stratified random sampling scheme: Application of Covid-19 pandemic
2023
Extreme events give rise to extreme values of population parameters, and their estimates are usually computed using traditional moments, which are sensitive to extreme observations. This study proposes new calibration estimators based on L-Moments for the variance, one of the most important population parameters. A number of suitable calibration constraints under double stratified random sampling were defined for these estimators. The proposed L-Moments-based estimators were relatively more robust to extreme values. The empirical efficiency of the proposed estimators was also assessed through simulation. Covid-19 pandemic data from January 22, 2020 to August 23, 2020 were used in the simulation study.
Journal Article