3,501 results for "kernel density"
Enhancing Broiler Weight Estimation through Gaussian Kernel Density Estimation Modeling
Managing individual broiler weights is crucial not only for increasing farm income but also for the revenue growth of integrated broiler companies, so it requires prompt attention. This paper proposes a model for estimating the daily average broiler weight from time and weight data collected through scales. The proposed model employs a self-adjusting weighting method in the bandwidth calculation formula, and the representative value of the daily average weight is estimated using kernel density estimation (KDE). The focus of this study is to contribute to individual broiler weight management by closely examining daily fluctuations in average broiler weight. To this end, weight and time data are collected through scales and preprocessed. The Gaussian kernel density estimation model proposed in this paper estimates the representative value of the daily average weight of a single broiler using statistical estimation methods with self-adjusting bandwidth values. When applied to the dataset collected through the scales, the proposed model produced daily weight estimates that did not deviate by more than ±50 g from the actual measured values. The next steps of this study are to systematically examine how the broiler rearing environment affects weight in order to support sustainable management of broiler demand, to derive optimal rearing conditions for each farm by combining location and weight data, and to develop a model for predicting the daily average weight. The ultimate goal is to develop an artificial intelligence model suitable for weight management systems that uses the estimated daily average weight of a single broiler even in the presence of erroneous data from multiple weight measurements, enabling more efficient automatic measurement of broiler weight and supporting both farms and broiler demand.
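A minimal sketch of the general idea, not the paper's model: fit a Gaussian KDE to one day's scale readings and take the mode of the estimated density as the daily representative weight. The paper's self-adjusting bandwidth formula is not given in the abstract, so SciPy's Silverman rule stands in, and the readings below are synthetic.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Synthetic one-day scale readings in grams; real data would come from the scales.
weights = rng.normal(loc=1480.0, scale=60.0, size=500)

# Silverman's rule stands in for the paper's self-adjusting bandwidth.
kde = gaussian_kde(weights, bw_method="silverman")

grid = np.linspace(weights.min(), weights.max(), 1000)
density = kde(grid)

# Take the mode of the estimated density as the daily representative weight.
representative_weight = grid[np.argmax(density)]
print(f"Estimated daily representative weight: {representative_weight:.1f} g")
```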
Geotechnologies applied to geographic information system (GIS) of Fish farming in Rondônia state, Western Amazon
This research presents a Geographic Information System (GIS) of licensed fish farms in Rondônia state, Brazil. Based on the structure of the GIS, spatial analyses of the location and distribution of fish farms were carried out in relation to the highway network, the drainage network, and the microregions of Rondônia, together with a density analysis. The methodological procedure consisted of modeling the database (DB), whose information was obtained from the Secretaria do Estado de Rondônia para Desenvolvimento Ambiental (SEDAM/RO), which holds the records of licensed fish farms, processed in the SPRING and ArcGIS 9 (ArcMap 9.3) software. For the spatial statistics, the kernel density estimator was applied. The main result is that the GIS made it quick and easy to search for data and information about the fish farms studied. The highest density was 4,937.64 fish farms per unit area in the Ji-Paraná microregion, located in the central region of Rondônia state. In the thematic mapping, the fish farms showed some spatial dependencies: (i) they depend on the main access road, highway BR-364; and (ii) clusters of fish farms occur where water is more readily available, that is, they depend on water courses. The positioning and distribution of fish farms is therefore concentrated in three main microregions: Ji-Paraná, with 40.30% of licensed fish farms, followed by the microregions of Cacoal (16.02%) and Ariquemes (15.87%).
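For illustration only, a 2-D kernel density surface over synthetic point locations, roughly analogous to the farm-density mapping described above; the study itself used SPRING and ArcGIS rather than Python.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# Synthetic farm coordinates (e.g. projected easting/northing in km).
x = rng.normal(0.0, 5.0, 300)
y = rng.normal(0.0, 5.0, 300)

kde = gaussian_kde(np.vstack([x, y]))           # 2-D kernel density estimator

xg, yg = np.meshgrid(np.linspace(-15, 15, 100), np.linspace(-15, 15, 100))
density = kde(np.vstack([xg.ravel(), yg.ravel()])).reshape(xg.shape)

# Report where the estimated farm density peaks.
peak = np.unravel_index(np.argmax(density), density.shape)
print(f"Density peak near ({xg[peak]:.1f}, {yg[peak]:.1f})")
```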
Integrated statistical modeling method: part I—statistical simulations for symmetric distributions
The use of parametric and nonparametric statistical modeling methods differs depending on data sufficiency. For sufficient data, the parametric statistical modeling method is preferred owing to its high convergence to the population distribution. Conversely, for insufficient data, the nonparametric method is preferred owing to its high flexibility and conservative modeling of the given data. However, it is difficult for users to select either a parametric or nonparametric modeling method because the adequacy of using one of these methods depends on how well the given data represent the population model, which is unknown to users. For insufficient data or limited prior information on random variables, the interval approach, which uses interval information of data or random variables, can be used. However, it is still difficult to use in uncertainty analysis and design owing to imprecise probabilities. In this study, to overcome this problem, an integrated statistical modeling (ISM) method, which combines the parametric, nonparametric, and interval approaches, is proposed. The ISM method uses the two-sample Kolmogorov–Smirnov (K–S) test to determine whether to use the parametric or the nonparametric method according to data sufficiency. Sequential statistical modeling (SSM) and kernel density estimation with estimated bounded data (KDE-ebd) are used as the parametric and nonparametric methods combined with the interval approach, respectively. To verify the modeling accuracy, conservativeness, and convergence of the proposed method, it is compared with the original SSM and KDE-ebd for various sample sizes and distribution types in simulation tests. Through an engineering and reliability analysis example, it is shown that the proposed ISM method achieves the highest accuracy and reliability in statistical modeling, regardless of data sufficiency. The ISM method is applicable to real engineering data; unlike SSM, it remains conservative in reliability analysis for insufficient data, and it converges to the exact probability of failure more rapidly than KDE-ebd as data increase.
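A hedged sketch of the decision step described above, under the assumption (not stated in the abstract) that the two samples for the Kolmogorov–Smirnov test are the observed data and draws from the fitted parametric model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.lognormal(mean=0.0, sigma=0.6, size=40)   # small, skewed sample

# Parametric candidate: a normal distribution fitted to the data.
mu, sigma = data.mean(), data.std(ddof=1)
synthetic = rng.normal(mu, sigma, size=1000)         # draws from the fitted model

# Two-sample K-S test decides whether the parametric fit is acceptable.
stat, p_value = stats.ks_2samp(data, synthetic)
if p_value > 0.05:
    model = stats.norm(mu, sigma)                    # keep the parametric model
    print(f"Parametric model kept (p = {p_value:.3f})")
else:
    model = stats.gaussian_kde(data)                 # fall back to a KDE
    print(f"Switched to nonparametric KDE (p = {p_value:.3f})")
```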
Current Situation and Optimization Strategy of Family Service Industry in Hebei Province Based on Fuzzy Mathematical Model Processing
This paper constructs an evaluation system for the development of Hebei Province's household service industry covering policy, construction, and talent, and measures the development level of the industry and the characteristics of its spatio-temporal evolution using the entropy method, kernel density estimation, and a spatial Markov chain. Based on the quantitative measurement results, fuzzy mathematical theory is used to construct an optimization model for the existing development of the household service industry in Hebei Province. According to the measurements in this paper, the average comprehensive level of development of the household service industry in Hebei Province grew from 0.0739 in 2010 to 0.1305 in 2020. Although the household service industry in Shijiazhuang City maintained a high level of growth from 2010 to 2020, the overall development of the household service industry in Hebei Province is relatively low, with large differences between regions. According to the fuzzy mathematical optimization model of this paper, the optimal strategy for Hebei Province to adopt in developing the household service industry is the construction of a demonstration domestic-helper training base.
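The abstract does not detail its indicator system, but the entropy method it mentions can be sketched as follows; the indicator matrix below is synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.random((11, 4))                  # 11 years (2010-2020) x 4 synthetic indicators

# Min-max normalize each indicator (small constant avoids log(0)).
norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0)) + 1e-12
p = norm / norm.sum(axis=0)

# Information entropy per indicator and the resulting entropy weights.
entropy = -(p * np.log(p)).sum(axis=0) / np.log(X.shape[0])
weights = (1.0 - entropy) / (1.0 - entropy).sum()

# Composite development index per year.
composite = norm @ weights
print(np.round(composite, 4))
```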
Development of a kernel density estimation with hybrid estimated bounded data
Uncertainty quantification, which identifies a probabilistic distribution for uncertain data, is important for yielding accurate and reliable results in reliability analysis and reliability-based design optimization. Sufficient data are needed for accurate uncertainty quantification, but data are very limited in engineering fields. For statistical modeling using insufficient data, kernel density estimation (KDE) with estimated bounded data (KDE-ebd) has recently been developed for more accurate and conservative estimation than the original KDE by combining the given data with bounded data generated within estimated intervals of the random variables. However, the density function estimated using KDE-ebd extends beyond the domain of the random variables because the conservative estimation produces long, thick tails. To overcome this problem, this paper proposes kernel density estimation with hybrid estimated bounded data (KDE-Hebd), which does not violate the domain of the random variables and uses point or interval estimation of the bounds when generating the bounded data. KDE-ebd often yields overly wide bounds for very insufficient data or large variations because it uses only the estimated intervals of the random variables. The proposed KDE with hybrid estimated bounded data alternatively selects a point estimator or an interval estimator according to whether the estimated intervals violate the domain of the random variables. The performance of the proposed method was evaluated by comparing the estimation accuracy in statistical simulation tests on mathematically derived sample data and real experimental data using KDE, KDE-ebd, and KDE-Hebd. The results demonstrate that KDE-Hebd is more accurate than KDE-ebd and does not violate the domain of the random variables, especially for a large coefficient of variation.
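A rough illustration of the hybrid bound-selection idea, not the actual KDE-ebd/KDE-Hebd estimators: an interval estimate of a bound is used unless it violates the known domain, in which case a point estimate is used instead, and the chosen bound values augment the sample before fitting a KDE. The interval estimator below is a crude stand-in.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
data = 2.0 * rng.weibull(1.5, size=15)          # small nonnegative sample
domain_lower = 0.0                               # known physical lower bound

# Crude interval estimate of the lower bound (illustrative only).
spread = data.std(ddof=1)
interval_lower = data.min() - 3.0 * spread

# Hybrid rule: use the interval estimate unless it violates the domain,
# otherwise fall back to a point estimate (here, the sample minimum).
lower = interval_lower if interval_lower >= domain_lower else data.min()
upper = data.max() + 3.0 * spread

# Augment the sample with the chosen bound values, then fit a KDE.
kde = gaussian_kde(np.concatenate([data, [lower, upper]]))
print(f"Lower bound used: {lower:.3f} (interval estimate was {interval_lower:.3f})")
```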
Fast Computation of Kernel Estimators
The computational complexity of evaluating a kernel density estimate (or its derivatives) at m evaluation points given n sample points scales as O(nm), making it prohibitively expensive for large datasets. While approximate methods like binning can speed up the computation, they lack precise control over the accuracy of the approximation: there is no straightforward way to choose the binning parameters a priori in order to achieve a desired approximation error. We propose a novel, computationally efficient ε-exact approximation algorithm for univariate Gaussian kernel-based density derivative estimation that reduces the computational complexity from O(nm) to linear O(n+m). The user can specify a desired accuracy ε, and the algorithm guarantees that the error between the approximation and the original kernel estimate is always less than ε. We also apply the proposed fast algorithm to speed up automatic bandwidth selection procedures. We compare our method to the best available binning methods in terms of speed and accuracy. Our experimental results show that the proposed method is almost twice as fast as the best binning methods and around five orders of magnitude more accurate. Software for the proposed method is available online.
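For context, a direct O(n·m) evaluation of a univariate Gaussian KDE looks like the sketch below; this is the quadratic baseline the paper accelerates, not the ε-exact algorithm itself.

```python
import numpy as np

def naive_gaussian_kde(samples, grid, h):
    """Direct evaluation: one Gaussian term per (sample, grid point) pair."""
    n = samples.size
    # Broadcasting forms the full n x m matrix of scaled differences,
    # which is exactly the O(n*m) work a fast algorithm would avoid.
    z = (grid[None, :] - samples[:, None]) / h
    return np.exp(-0.5 * z**2).sum(axis=0) / (n * h * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(5)
samples = rng.normal(size=2000)
grid = np.linspace(-4.0, 4.0, 500)
density = naive_gaussian_kde(samples, grid, h=0.2)
print(f"Peak density: {density.max():.3f}")
```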
Methods for Summarizing Radiocarbon Datasets
Bayesian models have proved very powerful in analyzing large datasets of radiocarbon (14C) measurements from specific sites and in regional cultural or political models. These models require a prior to be defined for the underlying processes being described, including the distribution of underlying events. Chronological information is also incorporated into Bayesian models used in DNA research, where Skyline plots are used to show demographic trends. Despite these advances, there remain difficulties in assessing whether data conform to the assumed underlying models and in dealing with the type of artifacts seen in Sum plots. In addition, existing methods are not applicable where it is not possible to quantify the underlying process, or where sample selection is thought to have filtered the data in a way that masks the original event distribution. In this paper three different approaches are compared: "Sum" distributions, postulated undated events, and kernel density approaches. Their implementation in the OxCal program is described, and their suitability for visualizing the results of chronological and geographic analyses is considered for cases with and without useful prior information. The conclusion is that kernel density analysis is a powerful method that could be applied much more widely across dating applications.
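A toy contrast between a "Sum"-style curve and a KDE summary, using synthetic uncalibrated dates; the Bayesian KDE model implemented in OxCal is considerably more involved.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(6)
dates = rng.normal(3500.0, 120.0, size=60)     # synthetic point date estimates (cal BP)
errors = np.full(dates.size, 30.0)             # per-date measurement errors

grid = np.linspace(3000.0, 4000.0, 500)
# "Sum"-style curve: average of one Gaussian per date, using its error.
sum_curve = np.mean([norm(d, e).pdf(grid) for d, e in zip(dates, errors)], axis=0)
# KDE summary over the same point estimates, with a data-driven bandwidth.
kde_curve = gaussian_kde(dates)(grid)
print(f"Sum peak: {grid[sum_curve.argmax()]:.0f} cal BP, KDE peak: {grid[kde_curve.argmax()]:.0f} cal BP")
```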
Mapping global urban boundaries from the global artificial impervious area (GAIA) data
Urban boundaries, an essential property of cities, are widely used in many urban studies. However, extracting urban boundaries from satellite images is still a great challenge, especially at a global scale and a fine resolution. In this study, we developed an automatic delineation framework to generate a multi-temporal dataset of global urban boundaries (GUB) using 30 m global artificial impervious area (GAIA) data. First, we delineated an initial urban boundary by filling inner non-urban areas of each city. A kernel density estimation approach and cellular-automata-based urban growth modeling were jointly used in this step. Second, we improved the initial urban boundaries around urban fringe areas, using a morphological approach that dilates and erodes the derived urban extent. We implemented this delineation on the Google Earth Engine platform and generated a 30 m resolution global urban boundary dataset for seven representative years (i.e. 1990, 1995, 2000, 2005, 2010, 2015, and 2018). Our extracted urban boundaries show good agreement with results derived from nighttime light data and human interpretation, and they delineate the urban extent of cities well when compared with high-resolution Google Earth images. The total area of the 65 582 GUBs, each of which exceeds 1 km², is 809 664 km² in 2018. Impervious surface areas account for approximately 60% of the total. From 1990 to 2018, the proportion of impervious areas in the delineated boundaries increased from 53% to 60%, suggesting compact urban growth over the past decades. We found that the United States has the highest per capita urban area (i.e. more than 900 m²) among the top 10 most urbanized nations in 2018. This dataset provides a physical boundary of urban areas that can be used to study the impact of urbanization on food security, biodiversity, climate change, and urban health. The GUB dataset can be accessed from http://data.ess.tsinghua.edu.cn.
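A toy raster sketch of the boundary clean-up described above: a Gaussian filter stands in for the kernel density step over impervious pixels, and morphological dilation followed by erosion regularizes the extent. The real workflow runs on Google Earth Engine with 30 m GAIA data; everything below is synthetic.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, binary_dilation, binary_erosion

rng = np.random.default_rng(7)
impervious = rng.random((200, 200)) < 0.05      # scattered impervious pixels
impervious[80:120, 80:120] = True               # a synthetic "urban core"

# A Gaussian filter stands in for the kernel density step over impervious pixels.
density = gaussian_filter(impervious.astype(float), sigma=3.0)
urban = density > 0.3                           # initial urban extent

# Morphological closing (dilate, then erode) regularizes the boundary.
structure = np.ones((5, 5), dtype=bool)
boundary = binary_erosion(binary_dilation(urban, structure), structure)
print(f"Urban pixels after closing: {int(boundary.sum())}")
```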
Density-based weighting for imbalanced regression
In many real-world settings, imbalanced data impedes the performance of learning algorithms such as neural networks, especially for rare cases. This is particularly problematic for tasks that focus on these rare occurrences. For example, when estimating precipitation, extreme rainfall events are scarce but important considering their potential consequences. While there are numerous well-studied solutions for classification settings, most of them cannot easily be applied to regression. Of the few solutions for regression tasks, barely any have explored cost-sensitive learning, which is known to have advantages over sampling-based methods in classification tasks. In this work, we propose a sample-weighting approach for imbalanced regression datasets called DenseWeight and, based on this weighting scheme, a cost-sensitive learning approach for neural network regression with imbalanced data called DenseLoss. DenseWeight weights data points according to the rarity of their target values, estimated through kernel density estimation (KDE). DenseLoss adjusts each data point's influence on the loss according to DenseWeight, giving rare data points more influence on model training than common data points. We show on multiple differently distributed datasets that DenseLoss significantly improves model performance for rare data points through its density-based weighting scheme. Additionally, we compare DenseLoss to the state-of-the-art method SMOGN and find that our method mostly yields better performance. Our approach provides more control over model training, as it lets us actively decide on the trade-off between focusing on common or rare cases through a single hyperparameter, allowing the training of better models for rare data points.
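A hedged sketch of the density-based weighting idea: a KDE over the targets, normalized densities, and larger weights for rare target values. The exact DenseWeight formula and the role of its hyperparameter are only approximated here.

```python
import numpy as np
from scipy.stats import gaussian_kde

def dense_weights(y, alpha=1.0, eps=1e-6):
    """Weight samples inversely to the (normalized) density of their target value."""
    dens = gaussian_kde(y)(y)
    dens = (dens - dens.min()) / (dens.max() - dens.min())   # scale to [0, 1]
    w = np.maximum(1.0 - alpha * dens, eps)                  # rare targets -> weight near 1
    return w / w.mean()                                      # keep the average weight at 1

rng = np.random.default_rng(8)
y = np.concatenate([rng.normal(5.0, 1.0, 950), rng.normal(60.0, 5.0, 50)])  # imbalanced targets
w = dense_weights(y, alpha=1.0)
print(f"Mean weight of rare targets: {w[y > 30].mean():.2f}")
# In a DenseLoss-style setup these weights would multiply each sample's loss term.
```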
Flexible methods for species distribution modeling with small samples
Species distribution models (SDMs) predict where species live or could potentially live and are a key resource for ecological research and conservation decision-making. However, current SDM methods often perform poorly for rare or inadequately sampled species, which include most species on earth as well as most of those of greatest conservation concern. Here, we evaluated the performance of three modeling approaches designed for data-deficient situations: plug-and-play modeling, density-ratio modeling, and environmental-range modeling. We compared the performance of algorithms within these approaches with the maximum entropy (MaxEnt) model, a widely used density-ratio algorithm, both for data-poor species and more generally. We also tested to what extent model cross-validation performance on training data predicts model performance on independent presence–absence data. We found that no algorithm performed best in all situations. Across all species, MaxEnt performed best on average but was outperformed by one or more of the plug-and-play, density-ratio, or environmental-range algorithms in 72% of cases. Six of the other algorithms had area under the receiver operating characteristic curve (AUC) distributions not significantly different from MaxEnt's, and for data-poor species (those with 20 or fewer occurrences), 24 of the algorithms considered had AUC distributions not significantly different from MaxEnt's. However, we found that the algorithm outputs (when thresholded to predict presence versus absence) spanned a wide sensitivity–specificity gradient. Specificity and prediction accuracy assessed on training data were strongly correlated with specificity and prediction accuracy assessed on independent presence–absence data. However, AUC and sensitivity were weakly correlated between training and testing sets, with only 22% of species having the same model perform best when evaluated on training data and on independent presence–absence data. Finally, we show how ensembles of algorithms that span the sensitivity–specificity gradient can represent model disagreement for poorly sampled species and improve model predictions.
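A minimal sketch of the density-ratio idea mentioned above: KDEs fitted to environmental covariates at presence points and at background points, with their ratio used as a relative suitability score. This illustrates the general approach on synthetic data, not any specific algorithm evaluated in the study.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(9)
# Two environmental covariates sampled over the whole region (background) and
# at a handful of occurrence locations (presence).
background = rng.normal(size=(2, 5000))
presence = rng.normal(loc=[[1.0], [0.5]], scale=0.5, size=(2, 30))

f_presence = gaussian_kde(presence)         # density of covariates at presences
f_background = gaussian_kde(background)     # density of covariates overall

# Relative suitability of candidate sites: presence density / background density.
sites = rng.normal(size=(2, 10))
suitability = f_presence(sites) / np.maximum(f_background(sites), 1e-12)
print(np.round(suitability, 2))
```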