13,653 result(s) for "Spatial and Geographic Information Science"
Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables
Random forest and similar machine learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions may be biased, and this is suboptimal. This paper presents a random forest for spatial predictions framework (RFsp) where buffer distances from observation points are used as explanatory variables, thus incorporating geographical proximity effects into the prediction process. The RFsp framework is illustrated with examples that use textbook datasets and apply spatial and spatio-temporal prediction to numeric, binary, categorical, multivariate and spatiotemporal variables. Performance of the RFsp framework is compared with the state-of-the-art kriging techniques using fivefold cross-validation with refitting. The results show that RFsp can obtain equally accurate and unbiased predictions as different versions of kriging. Advantages of using RFsp over kriging are that it needs no rigid statistical assumptions about the distribution and stationarity of the target variable, it is more flexible towards incorporating, combining and extending covariates of different types, and it possibly yields more informative maps characterizing the prediction error. RFsp appears to be especially attractive for building multivariate spatial prediction models that can be used as “knowledge engines” in various geoscience fields. Some disadvantages of RFsp are the exponentially growing computational intensity with increase of calibration data and covariates and the high sensitivity of predictions to input data quality.
The key to the success of the RFsp framework might be the training data quality, especially the quality of spatial sampling (to minimize extrapolation problems and any type of bias in data) and the quality of model validation (to ensure that accuracy is not affected by overfitting). For many data sets, especially those with fewer points and covariates and close-to-linear relationships, model-based geostatistics can still lead to more accurate predictions than RFsp.
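The idea of using buffer distances from observation points as explanatory variables can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `buffer_distance_features` and the coordinates are hypothetical, and plain Euclidean distance in projected coordinates is assumed. Each prediction location gets one feature column per observation point, and those columns could then be passed to any random forest learner alongside other covariates.

```python
import math


def buffer_distance_features(obs_points, target_points):
    """Build distance-based covariates.

    Returns one row per target point; column j holds the Euclidean
    distance from that target to observation point j.
    """
    return [
        [math.hypot(tx - ox, ty - oy) for (ox, oy) in obs_points]
        for (tx, ty) in target_points
    ]


# Hypothetical coordinates, for illustration only.
obs = [(0.0, 0.0), (3.0, 4.0)]    # observation (sampling) locations
grid = [(0.0, 0.0), (3.0, 0.0)]   # locations where predictions are wanted

features = buffer_distance_features(obs, grid)
for location, row in zip(grid, features):
    print(location, row)
```

The number of feature columns grows with the number of observation points, which is consistent with the computational cost the abstract lists as a disadvantage.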
Hierarchical generalized additive models in ecology: an introduction with mgcv
In this paper, we discuss an extension to two popular approaches to modeling complex structures in ecological data: the generalized additive model (GAM) and the hierarchical model (HGLM). The hierarchical GAM (HGAM) allows modeling of nonlinear functional relationships between covariates and outcomes where the shape of the function itself varies between different grouping levels. We describe the theoretical connection between HGAMs, HGLMs, and GAMs, explain how to model different assumptions about the degree of intergroup variability in functional response, and show how HGAMs can be readily fitted using existing GAM software, the mgcv package in R. We also discuss computational and statistical issues with fitting these models, and demonstrate how to fit HGAMs on example data. All code and data used to generate this paper are available at: github.com/eric-pedersen/mixed-effect-gams.
Combining UAV-based hyperspectral imagery and machine learning algorithms for soil moisture content monitoring
Soil moisture content (SMC) is an important factor that affects agricultural development in arid regions. Compared with the space-borne remote sensing system, the unmanned aerial vehicle (UAV) has been widely used because of its stronger controllability and higher resolution. It also provides a more convenient method for monitoring SMC than normal measurement methods that include field sampling and oven-drying techniques. However, research based on UAV hyperspectral data has not yet formed a standard procedure in arid regions. Therefore, a universal processing scheme is required. We hypothesized that combining pretreatments of UAV hyperspectral imagery under optimal indices and a set of field observations within a machine learning framework will yield a highly accurate estimate of SMC. Optimal 2D spectral indices act as indispensable variables and allow us to characterize a model’s SMC performance and spatial distribution. For this purpose, we used hyperspectral imagery and a total of 70 topsoil samples (0–10 cm) from the farmland (2.5 × 10⁴ m²) of Fukang City, Xinjiang Uygur Autonomous Region, China. The random forest (RF) method and extreme learning machine (ELM) were used to estimate the SMC using six methods of pretreatments combined with four optimal spectral indices. The validation accuracy of the estimated method clearly increased compared with that of linear models. The combination of pretreatments and indices by our assessment effectively eliminated interference and noise. Comparing two machine learning algorithms showed that the RF models were superior to the ELM models, and the best model was PIR (R²_val = 0.907, RMSEP = 1.477, and RPD = 3.396). The SMC map predicted via the best scheme was highly similar to the SMC map measured. We conclude that combining preprocessed spectral indices and machine learning algorithms allows estimation of SMC with high accuracy (R²_val = 0.907) via UAV hyperspectral imagery on a regional scale. Ultimately, our program might improve management and conservation strategies for agroecosystems in arid regions.
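The three validation scores quoted in this abstract (R², RMSEP and RPD) can be computed from paired observed and predicted values. The sketch below rests on assumptions the abstract does not confirm: R² is taken as the coefficient of determination (some studies report the squared correlation instead), and RPD is taken as the sample standard deviation of the observations divided by RMSEP. The function name and the data are hypothetical.

```python
import math
import statistics


def validation_scores(observed, predicted):
    """Return (r2, rmsep, rpd) for a validation set."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmsep = math.sqrt(ss_res / n)            # root mean squared error of prediction
    r2 = 1.0 - ss_res / ss_tot               # coefficient of determination (assumed)
    rpd = statistics.stdev(observed) / rmsep  # sample SD over RMSEP (assumed)
    return r2, rmsep, rpd


# Made-up moisture values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, rmsep, rpd = validation_scores(observed, predicted)
print(round(r2, 3), round(rmsep, 3), round(rpd, 2))
```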
Growth of water hyacinth biomass and its impact on the floristic composition of aquatic plants in a wetland ecosystem of the Brahmaputra floodplain of Assam, India
Inland water plants, particularly those that thrive in shallow environments, are vital to the health of aquatic ecosystems. Water hyacinth is a typical example of inland species, an invasive aquatic plant that can drastically alter the natural plant community’s floral diversity. The present study aims to assess the impact of water hyacinth biomass on the floristic characteristics of aquatic plants in the Merbil wetland of the Brahmaputra floodplain, NE, India. Using a systematic sampling technique, data were collected from the field at regular intervals for one year (2021) to estimate monthly water hyacinth biomass. The total estimate of the wetland’s biomass was made using the Kriging interpolation technique. The Shannon-Wiener diversity index (H′), Simpson’s diversity index (D), dominance and evenness or equitability index (E), density, and frequency were used to estimate the floristic characteristics of aquatic plants in the wetland. The result shows that the highest biomass was recorded in September (408.1 tons/ha), while the lowest was recorded in March (38 tons/ha). The floristic composition of aquatic plants was significantly influenced by water hyacinth biomass. A total of forty-one plant species from 23 different families were found in this tiny freshwater marsh during the floristic survey. Out of the total, 25 species were emergent, 11 were floating leaves, and the remaining five were free-floating habitats. Eichhornia crassipes was the wetland’s most dominant plant. A negative correlation was observed between water hyacinth biomass and the Shannon (H) index, Simpson diversity index, and evenness. We observed that water hyacinths had changed the plant community structure of freshwater habitats in the study area. Water hyacinth’s rapid expansion blocked out sunlight, reducing the ecosystem’s productivity and ultimately leading to species loss.
The study will help devise plans for the sustainable management of natural resources and provide helpful guidance for maintaining the short- to medium-term ecological balance in similar wetlands.
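The diversity measures named in this abstract can be sketched from per-species counts. The abstract does not give the exact formulas used, so the following are assumptions: Shannon-Wiener H′ = −Σ p·ln p, Simpson's index in its 1 − Σ p² form (other variants exist), and evenness E = H′ / ln S, where S is the number of species present. The function name and the counts are hypothetical.

```python
import math


def diversity_indices(counts):
    """Return (shannon, simpson, evenness) from per-species abundance counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]   # relative abundances
    shannon = -sum(p * math.log(p) for p in props)
    simpson = 1.0 - sum(p * p for p in props)      # 1 - D form (assumed)
    richness = len(props)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return shannon, simpson, evenness


# Made-up communities: one perfectly even, one dominated by a single species.
even_h, even_d, even_e = diversity_indices([10, 10, 10, 10])
dom_h, dom_d, dom_e = diversity_indices([97, 1, 1, 1])
print(round(even_h, 3), round(even_d, 3), round(even_e, 3))
print(round(dom_h, 3), round(dom_d, 3), round(dom_e, 3))
```

A community dominated by one species scores lower on all three measures, which is the direction of the negative correlation with water hyacinth biomass reported above.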
Global mapping of potential natural vegetation: an assessment of machine learning algorithms for estimating land potential
Potential natural vegetation (PNV) is the vegetation cover in equilibrium with climate, that would exist at a given location if not impacted by human activities. PNV is useful for raising public awareness about land degradation and for estimating land potential. This paper presents results of assessing machine learning algorithms—neural networks (nnet package), random forest (ranger), gradient boosting (gbm), K-nearest neighborhood (class) and Cubist—for operational mapping of PNV. Three case studies were considered: (1) global distribution of biomes based on the BIOME 6000 data set (8,057 modern pollen-based site reconstructions), (2) distribution of forest tree taxa in Europe based on detailed occurrence records (1,546,435 ground observations), and (3) global monthly fraction of absorbed photosynthetically active radiation (FAPAR) values (30,301 randomly-sampled points). A stack of 160 global maps representing biophysical conditions over land, including atmospheric, climatic, relief, and lithologic variables, were used as explanatory variables. The overall results indicate that random forest gives the overall best performance. The highest accuracy for predicting BIOME 6000 classes (20) was estimated to be between 33% (with spatial cross-validation) and 68% (simple random sub-setting), with the most important predictors being total annual precipitation, monthly temperatures, and bioclimatic layers. Predicting forest tree species (73) resulted in mapping accuracy of 25%, with the most important predictors being monthly cloud fraction, mean annual and monthly temperatures, and elevation. Regression models for FAPAR (monthly images) gave an R-square of 90% with the most important predictors being total annual precipitation, monthly cloud fraction, CHELSA bioclimatic layers, and month of the year, respectively. Further developments of PNV mapping could include using all GBIF records to map the global distribution of plant species at different taxonomic levels. 
This methodology could also be extended to dynamic modeling of PNV, so that future climate scenarios can be incorporated. Global maps of biomes, FAPAR and tree species at 1 km spatial resolution are available for download via http://dx.doi.org/10.7910/DVN/QQHCIK.
Quantitative estimation of soil salinity by means of different modeling methods and visible-near infrared (VIS–NIR) spectroscopy, Ebinur Lake Wetland, Northwest China
Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS–NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and the fractional derivative order in the range of 0.0–2.0 order with an interval of 0.1. Two different modeling methods, namely, partial least squares regression (PLSR) and random forest (RF) with preprocessed reflectance were used for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectrum reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than those of PLSR. The most effective model was established based on RF with the 1.5 order derivative of absorbance with the optimal values of R² (0.93), RMSE (4.57 dS m⁻¹), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity.
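A fractional-order derivative of a sampled spectrum can be approximated numerically. The abstract does not say which definition the study used; the sketch below assumes the Grünwald-Letnikov difference scheme, a common choice for sampled data, with uniformly spaced bands. The function names and the sample values are hypothetical.

```python
def gl_weights(order, n):
    """First n Grünwald-Letnikov weights: w0 = 1, wk = w(k-1) * (1 - (order + 1) / k)."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (1.0 - (order + 1.0) / k))
    return weights


def fractional_derivative(values, order, step=1.0):
    """Approximate the derivative of the given order at every sample.

    Each output is a weighted sum over the current and all earlier samples,
    divided by step ** order. Order 0 returns the input unchanged and
    order 1 reduces to the backward first difference.
    """
    weights = gl_weights(order, len(values))
    result = []
    for i in range(len(values)):
        acc = sum(weights[k] * values[i - k] for k in range(i + 1))
        result.append(acc / step ** order)
    return result


# Made-up absorbance values; orders 0.0 to 2.0 in steps of 0.1 as in the abstract.
absorbance = [1.0, 3.0, 6.0, 10.0]
orders = [round(0.1 * i, 1) for i in range(21)]
derivatives = {a: fractional_derivative(absorbance, a) for a in orders}
print(derivatives[1.5])
```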
Soil carbon sequestration potential in global croplands
Improving the amount of organic carbon in soils is an attractive alternative to partially mitigate climate change. However, the amount of carbon that can be potentially added to the soil is still being debated, and there is a lack of information on additional storage potential on global cropland. Soil organic carbon (SOC) sequestration potential is region-specific and conditioned by climate and management but most global estimates use fixed accumulation rates or time frames. In this study, we model SOC storage potential as a function of climate, land cover and soil. We used 83,416 SOC observations from global databases and developed a quantile regression neural network to quantify the SOC variation within soils with similar environmental characteristics. This allows us to identify similar areas that present higher SOC with the difference representing an additional storage potential. We estimated that the topsoils (0–30 cm) of global croplands (1,410 million hectares) hold 83 Pg C. The additional SOC storage potential in the topsoil of global croplands ranges from 29 to 65 Pg C. These values only equate to three to seven years of global emissions, potentially offsetting 35% of agriculture’s 85 Pg historical carbon debt estimate due to conversion from natural ecosystems. As SOC store is temperature-dependent, this potential is likely to reduce by 14% by 2040 due to climate change in a “business as usual” scenario. The results of this article can provide a guide to areas of focus for SOC sequestration, and highlight the environmental cost of agriculture.
Individual tree crown delineation and tree species classification with hyperspectral and LiDAR data
An international data science challenge, called National Ecological Observatory Network—National Institute of Standards and Technology data science evaluation, was set up in autumn 2017 with the goal to improve the use of remote sensing data in ecological applications. The competition was divided into three tasks: (1) individual tree crown (ITC) delineation, for identifying the location and size of individual trees; (2) alignment between field surveyed trees and ITCs delineated on remote sensing data; and (3) tree species classification. In this paper, the methods and results of team Fondazione Edmund Mach (FEM) are presented. The ITC delineation (Task 1 of the challenge) was done using a region growing method applied to a near-infrared band of the hyperspectral images. The optimization of the parameters of the delineation algorithm was done in a supervised way on the basis of the Jaccard score using the training set provided by the organizers. The alignment (Task 2) between the delineated ITCs and the field surveyed trees was done using the Euclidean distance among the position, the height, and the crown radius of the ITCs and the field surveyed trees. The classification (Task 3) was performed using a support vector machine classifier applied to a selection of the hyperspectral bands and the canopy height model. The selection of the bands was done using the sequential forward floating selection method and the Jeffries Matusita distance. The results of the three tasks were very promising: team FEM ranked first in the data science competition in Task 1 and 2, and second in Task 3. The Jaccard score of the delineated crowns was 0.3402, and the results showed that the proposed approach delineated both small and large crowns. The alignment was correctly done for all the test samples. 
The classification results were good (overall accuracy of 88.1%, kappa accuracy of 75.7%, and mean class accuracy of 61.5%), although the accuracy was biased toward the most represented species.
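The Jaccard score used to tune and evaluate the crown delineation is the intersection over union of two regions. A minimal sketch, assuming each crown is represented as a set of pixel coordinates (the challenge's actual data format is not described in the abstract, and the function name and pixels below are hypothetical):

```python
def jaccard(region_a, region_b):
    """Intersection over union of two pixel sets; 0.0 when both are empty."""
    a, b = set(region_a), set(region_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up crowns: a delineated crown and a reference crown sharing two pixels.
delineated = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = jaccard(delineated, reference)
print(round(score, 4))
```

The score is strict: here half of each crown overlaps, yet the result is only one third.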
Forest tree species distribution for Europe 2000–2020: mapping potential and realized distributions using spatiotemporal machine learning
This article describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data for a total of three million points was used to train different algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 305 coarse and high resolution covariates representing spectral reflectance, different biophysical conditions and biotic competition was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to tune and train an ensemble model based on stacking with a logistic regressor as a meta-learner. An ensemble model was trained for each species: probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years for a total of six distribution maps per species, while for potential distributions only one map per species was produced. Results of spatial cross validation show that the ensemble model consistently outperformed or performed as well as the best individual model in both potential and realized distribution tasks, with potential distribution models achieving higher predictive performances (TSS = 0.898, R²_logloss = 0.857) than realized distribution ones on average (TSS = 0.874, R²_logloss = 0.839). Ensemble models for Q. suber achieved the best performances in both potential (TSS = 0.968, R²_logloss = 0.952) and realized (TSS = 0.959, R²_logloss = 0.949) distribution, while P. sylvestris (TSS = 0.731, 0.785, R²_logloss = 0.585, 0.670, respectively, for potential and realized distribution) and P. nigra (TSS = 0.658, 0.686, R²_logloss = 0.623, 0.664) achieved the worst. Importance of predictor variables differed across species and models, with the green band for summer and the Normalized Difference Vegetation Index (NDVI) for fall for realized distribution and the diffuse irradiation and precipitation of the driest quarter (BIO17) being the most frequent and important for potential distribution. On average, fine-resolution models outperformed coarse resolution models (250 m) for realized distribution (TSS = +6.5%, R²_logloss = +7.5%). The framework shows how combining continuous and consistent Earth Observation time series data with state of the art machine learning can be used to derive dynamic distribution maps. The produced predictions can be used to quantify temporal trends of potential forest degradation and species composition change.
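The true skill statistic (TSS) reported throughout this abstract is sensitivity plus specificity minus one, computed from a presence/absence confusion matrix. A minimal sketch with made-up counts follows. The function name is hypothetical, and how the study thresholded its probability maps before computing TSS is not stated in the abstract.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)   # share of true presences predicted present
    specificity = tn / (tn + fp)   # share of true absences predicted absent
    return sensitivity + specificity - 1.0


# Made-up confusion-matrix counts, for illustration only.
tss = true_skill_statistic(tp=90, fn=10, tn=80, fp=20)
print(round(tss, 3))
```

A model no better than chance scores about 0 and a perfect one scores 1, so the values between 0.658 and 0.968 quoted above all indicate skill well above chance.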
Prophet forecasting model: a machine learning approach to predict the concentration of air pollutants (PM2.5, PM10, O3, NO2, SO2, CO) in Seoul, South Korea
Amidst recent industrialization in South Korea, Seoul has experienced high levels of air pollution, an issue that is magnified due to a lack of effective air pollution prediction techniques. In this study, the Prophet forecasting model (PFM) was used to predict both short-term and long-term air pollution in Seoul. The air pollutants forecasted in this study were PM2.5, PM10, O3, NO2, SO2, and CO, air pollutants responsible for numerous health conditions upon long-term exposure. Current chemical models to predict air pollution require complex source lists, making them difficult to use. Machine learning models have also been implemented; however, their requirement of meteorological parameters renders the models ineffective, as additional models and infrastructure need to be in place to model meteorology. To address this, a model needs to be created that can accurately predict pollution based on time. A dataset containing three years' worth of hourly air quality measurements in Seoul was sourced from the Seoul Open Data Plaza. To optimize the model, PFM has the following parameters: model type, change points, seasonality, holidays, and error. Cross validation was performed on the 2017–18 data; then, the model predicted 2019 values. To compare the predicted and actual values and determine the accuracy of the model, the following statistical indicators were used: mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), and coverage. PFM predicted PM2.5 and PM10 with MAE values of 12.6 µg/m³ and 19.6 µg/m³, respectively. PFM also predicted SO2 and CO with MAE values of 0.00124 ppm and 0.207 ppm, respectively. PFM’s prediction of PM2.5 and PM10 had a MAE approximately 2 times and 4 times less, respectively, than comparable models. PFM’s prediction of SO2 and CO had a MAE approximately five times and 50 times less, respectively, than comparable models.
In most cases, PFM’s ability to accurately forecast the concentration of air pollutants in Seoul up to one year in advance outperformed similar models proposed in the literature. This study addresses the limitations of the prior two PFM studies by expanding the modelled air pollutants from three pollutants to six pollutants while increasing the prediction time from 3 days to 1 year. This is also the first research to use PFM in Seoul, Korea. To achieve more accurate results, a larger air pollution dataset needs to be used with PFM. In the future, PFM should be used to predict and model air pollution in other regions, especially those without advanced infrastructure to model meteorology alongside air pollution. The Seoul government can use PFM to accurately predict air pollution concentrations and plan accordingly.
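The four indicators used to compare predicted and actual concentrations (MSE, MAE, RMSE and coverage) can be sketched as below. Coverage is assumed here to mean the share of actual values that fall inside the forecast's lower and upper uncertainty bounds. The function name and the numbers are hypothetical and are not taken from the study.

```python
import math


def forecast_metrics(actual, predicted, lower, upper):
    """Return MAE, MSE, RMSE and interval coverage for a forecast."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "coverage": inside / n}


# Made-up hourly concentrations with forecast intervals, for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 40.0]
lower = [9.0, 19.0, 31.0, 35.0]
upper = [15.0, 25.0, 36.0, 45.0]
metrics = forecast_metrics(actual, predicted, lower, upper)
print(metrics)
```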