57 result(s) for "Stratified random sampling technique"
A new modified estimator of population variance in calibrated survey sampling
In survey statistics, estimating and reducing population variation is crucial. Such variation can occur in any sampling design, including stratified random sampling, where stratum weights may increase the variance of estimators. Calibration techniques, which use additional auxiliary information, can help mitigate this issue. This paper examines three calibration-based estimators (calibration variance, calibration ratio, and calibration exponential ratio estimators) within the framework of stratified random sampling. The study generates data from normal, gamma, and exponential distributions to test these estimators. Results demonstrate that the proposed calibration estimators give more accurate and reliable estimates of population variance than existing methods under stratified random sampling.
Spatially Balanced Sampling through the Pivotal Method
A simple method to select a spatially balanced sample using equal or unequal inclusion probabilities is presented. For populations with spatial trends in the variables of interest, the estimation can be much improved by selecting samples that are well spread over the population. The method can be used for any number of dimensions and can hence also select spatially balanced samples in a space spanned by several auxiliary variables. Analysis and examples indicate that the suggested method achieves a high degree of spatial balance and is therefore efficient for populations with trends.
Copula Modeling and Uncertainty Propagation in Field‐Scale Simulation of CO2 Fault Leakage
Subsurface storage of CO2 is an important means to mitigate climate change, and the North Sea hosts considerable potential storage resources. To investigate the fate of CO2 over decades in vast reservoirs, numerical simulation based on realistic models is essential. Faults and other complex geological structures introduce modeling challenges as their effects on storage operations are subject to high uncertainty. We present a computational framework for forward propagation of uncertainty, including stochastic upscaling and copula representation of multivariate distributions for a CO2 storage site model with faults. The Vette fault zone in the Smeaheia formation in the North Sea is used as a test case. The stochastic upscaling method reduces the number of stochastic dimensions and the cost of evaluating the reservoir model. Copulas represent dependent multidimensional random variables with a good fit to data, and allow fast sampling and coupling to the forward propagation method via independent uniform random variables. The non‐stationary correlation within the upscaled flow functions is accurately captured by a data‐driven transformation model. The uncertainties in upscaled flow functions and other uncertain parameters are efficiently propagated to leakage estimates using numerical reservoir simulation of a two‐phase system of CO2 and brine. The expectations of leakage are estimated by an adaptive stratified sampling technique which effectively allocates samples in stochastic space. We demonstrate cost reductions of one or two orders of magnitude compared to standard Monte Carlo for simpler test cases, and cost reductions by factors of 2–8 for stochastic multi‐phase flow properties and more complex stochastic models.
Plain Language Summary: To limit global warming, greenhouse gases like CO2 can be injected into large reservoirs of porous rock below the seabed instead of being emitted to the atmosphere. CO2 will slowly move through the reservoirs and may encounter faults, geological features whose properties can either facilitate or stop the CO2 from moving further underground. It is important that the CO2 remains underground, and hence it is important to understand how it is affected by the fault, in particular when many physical rock properties are unknown because measurements are few or inexact. We present methods to model the uncertainty in and around the faults and show how more accurate computer simulations can be obtained by combining appropriate statistical models with adapted methods for investigating the effect of fault uncertainty on the risk of CO2 leakage.
Key Points:
  • Framework for efficient stochastic upscaling, modeling, and uncertainty propagation for CO2 storage, demonstrated on a North Sea test case
  • Stochastic fault properties upscaled to two‐phase flow functions with reduced complexity and a format suitable for uncertainty propagation
  • Significant computational cost reduction for adaptive stratified sampling compared to Monte Carlo sampling in estimation of CO2 leakage
Two-stage sampling for better survival model performance
Background: With the emergence of high-dimensional censored survival data in health and medicine, the use of survival models for risk prediction is increasing. To date, practical techniques exist for splitting data for model training and performance evaluation. While different sampling methods have been compared for their performance, the effects of the data splitting ratio and of survival-specific characteristics have not yet been examined for high-dimensional censored survival data. Methods: We first conduct an empirical study of how the simple random sampling technique and the stratified sampling technique affect Lasso Cox model performance on real high-dimensional gene expression datasets. For simple random sampling, various data splitting ratios are investigated. For stratified sampling, different survival-specific variables are investigated. We consider the C-index and the Brier score as evaluation metrics. We further develop and validate a two-stage purposive sampling approach motivated by our empirical findings. Results: Our findings reveal that survival-specific characteristics contribute to model performance across training, testing, and validation data. The proposed two-stage purposive sampling approach performs well in mitigating excessive diversity within the training data in both the simulation study and the real data analysis, leading to better survival model performance. Conclusions: We recommend careful consideration of key factors in different sampling techniques when developing and validating survival models. Methods such as the proposed one, which mitigate excessive diversity, provide a solution.
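As a rough illustration of the stratified splitting idea mentioned in the abstract above, the following minimal sketch splits records into training and test sets so that each stratum keeps about the same share in both. It is not the authors' two-stage purposive procedure. The function name `stratified_split`, the choice of the event indicator as the stratification variable, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_indices, test_indices), sampling each stratum separately."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for idx, label in enumerate(labels):
        by_stratum[label].append(idx)

    train, test = [], []
    for label in sorted(by_stratum):
        members = by_stratum[label]
        rng.shuffle(members)  # random order within the stratum
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)

# Hypothetical event indicators: 1 = event observed, 0 = censored.
events = [1] * 40 + [0] * 60
train_idx, test_idx = stratified_split(events, test_fraction=0.25, seed=42)
```

With these toy inputs, the test set takes a quarter of the observed events and a quarter of the censored records, so the censoring rate is the same in both parts of the split.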
Improved memory-type ratio estimator for population mean in stratified random sampling under linear and non-linear cost functions
This paper offers an improved memory-type ratio estimator in stratified random sampling under linear and non-linear cost functions. The problem is formulated as an all-integer non-linear programming problem (AINLPP). The sampling properties of the introduced estimator, mainly the bias and the mean squared error (MSE), are derived up to the first order of approximation. The optimum value of the characterizing scalar is obtained by the Lagrange method of maxima–minima, and the least value of the MSE of the suggested estimator is obtained for this optimum value. The suggested estimator is compared both theoretically and empirically with the competing estimators. Under this setup, the optimum allocation with the MSE of the suggested estimator is attained. The AINLPP is solved using the genetic programming approach, which is applied to both actual data sets and data sets simulated from a bivariate normal distribution.
Estimation of Population Mean Using Calibrated Weights in Stratified Random Successive Sampling in Presence of Incomplete Data
This article proposes a new sampling method to address challenges that often occur in estimation problems when mixed response and nonresponse patterns are observed. It introduces a range of estimators to mitigate the effects of nonresponse in survey data. The properties of the proposed estimators are analyzed in depth, and calibrated weights for each stratum are derived. Numerical studies demonstrate the superiority of the proposed estimation approach over conventional methods. Finally, recommendations are made to survey statisticians.
Improved Median Estimation in Stratified Surveys via Nontraditional Auxiliary Measures
This research focuses on estimating the population median within a stratified random sampling framework by using robust statistical measures with transformation-based methodologies. An efficient estimator aims to minimize both the bias and the variance, thereby reducing the overall mean squared error (MSE) and leading to more reliable outcomes. We introduce an improved class of estimators that uses transformation techniques to address data variability and enhance estimation accuracy. To evaluate their performance, we derive expressions for the bias and MSE up to the first-order approximation for both existing and newly developed estimators, establishing theoretical conditions for their effectiveness. Additionally, the proposed estimators are compared with traditional methods using simulated populations generated from different probability distributions and actual datasets. The results indicate that the newly introduced estimators improve precision and efficiency in median estimation and outperform conventional estimators in terms of the percent relative efficiency criterion.
Predictive Understanding of Wildfire Ignitions Across the Western United States
Wildfires have increasingly affected human and natural systems across the western United States (WUS) in recent decades. Given that the majority of ignitions are human‐caused and potentially preventable, improving the ability to predict fire occurrence is critical for effective wildfire prevention and risk mitigation. We used over 500,000 wildfire ignition records from 2000 to 2020 to develop machine learning models that predict daily ignition probability across the WUS and incorporate a wide range of physical, biological, social, and administrative variables. A key innovation of this work is the development of novel sampling techniques for representing ignition absence. Unlike traditional purely random sampling or hyper‐sampling, which do not account for temporally autocorrelated factors (such as droughts, insect outbreaks, and heatwaves) and spatially autocorrelated factors (such as proximity to human settlements, infrastructure presence, and fuel type), we introduce spatially and temporally stratified sampling of ignition absence. By drawing absence samples near the location and time of historical ignitions, we better captured the complex environmental and anthropogenic conditions associated with fire occurrence or lack thereof. Models trained without stratified sampling produced ignition probability maps that consistently overestimated fire risk during high fire danger periods, whereas models incorporating stratified fire absence samples more accurately captured the spatial and temporal variability of fire potential and achieved predictive accuracies exceeding 95%. In addition to operational utility for fire prevention and resource allocation, our approach offers insights into the drivers of wildfire ignitions and highlights the value of incorporating spatial and temporal structure in absence sampling for wildfire modeling. Wildfire prevention is one of the most effective and economical risk mitigation strategies.
Human‐started wildfires account for over 60% of all recorded wildfires across the western United States and are responsible for the vast majority of wildfire‐related societal impacts, underscoring the value of effective wildfire prevention strategies. To address this need, we developed machine learning models that not only effectively predict spatial and temporal patterns of wildfire ignitions but reveal the nuanced interactions among physical, biological, social, and administrative factors that govern wildfire ignition outcomes. Annual temperature (climate), discovery day‐of‐year (seasonal pattern), fire year (trend), and national preparedness level (management and fire danger) were the primary governing factors in models of all ignitions, natural ignitions, and human‐caused ignitions. Secondary governing factors of natural and human‐caused ignitions, respectively, were weather‐related attributes and weather and social attributes. Our results indicated that although daily ignition probabilities generally track weather patterns, they can remain persistently high in areas where human factors dominate. Our results also show that models relying solely on weather do not accurately predict wildfire ignitions, reinforcing the fact that ignitions are caused by complex interactions among diverse factors.
Key Points:
  • Daily wildfire ignition prediction models were developed and validated over the western United States
  • Robust sampling techniques to represent conditions associated with absence of wildfire ignitions were proposed
  • Wildfire ignitions are governed by complex interactions among physical, biological, social, and administrative factors, not weather alone
Stratified Median Estimation Using Auxiliary Transformations: A Robust and Efficient Approach in Asymmetric Populations
This study estimates the population median through stratified random sampling, which enhances accuracy by ensuring the proper representation of key population groups. The proposed class of estimators based on transformations effectively handles data variability and enhances estimation efficiency. We examine bias and mean square error expressions up to the first-order approximation for both existing and newly introduced estimators, establishing theoretical conditions for their applicability. Moreover, to assess the effectiveness of the suggested estimators, five simulated datasets derived from distinct asymmetric distributions (gamma, log-normal, Cauchy, uniform, and exponential), along with actual datasets, are used for numerical analysis. These estimators are designed to significantly enhance the precision and effectiveness of median estimation, resulting in more reliable and consistent outcomes. Comparative analysis using percent relative efficiency (PRE) reveals that the proposed estimators perform better than conventional approaches.
Remotely sensed data controlled forest inventory concept
Nowadays, the image of the forest in Germany is changing from monoculture areas to very mixed forests, where individual stands are no longer clearly visible. The objective of this study was to examine the use of remotely sensed data at enterprise level for pre-stratification and sample plot allocation in the planning stage of forest inventories in a very heterogeneous forest. On the basis of RapidEye satellite data and object-based image analysis, a stratified segment-based non-permanent sampling design was developed and evaluated against the results of a permanent systematic sampling design. The relative efficiency (RE) was calculated based on variance estimators for simple random sampling and stratified random sampling for the variable timber volume [m³/ha]. By stratification of the sample designs, we achieved an RE of 1.25 for the systematic sampling and 1.34 with the segment-based sampling design. Based on a targeted standard error of 4.6%, the sampling designs were compared with respect to the required sample size. The stratified segment-based sampling design reduced the number of sample plots compared to the systematic sampling design by 28%. Furthermore, it was shown that the possible reduction of sampling plots leads to a cost saving of 21%.
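The relative efficiency mentioned in the abstract above can be sketched as the ratio of the estimated variance of the mean under simple random sampling to that under stratified random sampling. The sketch below uses the standard textbook variance estimators with finite population corrections. The abstract does not state the study's exact formulas, so this is an assumption, as are the stratum names, the stratum sizes, and the timber volumes. Pooling the stratified sample as if it were a simple random sample is a further simplification that is only reasonable under roughly proportional allocation.

```python
from statistics import variance

def stratified_mean_variance(samples, stratum_sizes):
    """Estimated variance of the stratified mean: sum of W_h^2 (1 - n_h/N_h) s_h^2 / n_h."""
    N = sum(stratum_sizes.values())
    total = 0.0
    for h, y in samples.items():
        N_h, n_h = stratum_sizes[h], len(y)
        W_h = N_h / N            # stratum weight
        fpc = 1 - n_h / N_h      # finite population correction
        total += W_h ** 2 * fpc * variance(y) / n_h
    return total

def srs_mean_variance(samples, stratum_sizes):
    """Estimated variance of the mean, treating the pooled sample as an SRS."""
    N = sum(stratum_sizes.values())
    pooled = [v for y in samples.values() for v in y]
    n = len(pooled)
    return (1 - n / N) * variance(pooled) / n

def relative_efficiency(samples, stratum_sizes):
    """RE > 1 means stratification lowered the variance of the mean."""
    return srs_mean_variance(samples, stratum_sizes) / stratified_mean_variance(samples, stratum_sizes)

# Hypothetical timber volumes (m³/ha) per stratum and hypothetical stratum sizes.
plots = {"conifer": [300, 320, 310, 330], "broadleaf": [200, 210, 190, 220]}
sizes = {"conifer": 400, "broadleaf": 400}
re_value = relative_efficiency(plots, sizes)
```

In this toy example the strata differ strongly in mean volume, so most of the variation lies between strata and the stratified estimator is much more precise than the pooled one.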