Catalogue Search | MBRL
Explore the vast range of titles available.
14,252 result(s) for "Spatial and Geographic Information Science"
Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables
by Heuvelink, Gerard B.M.; Gräler, Benedikt; Nussbaum, Madlene
in Algorithms, Artificial intelligence, Biogeography
2018
Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions may be biased, and this is suboptimal. This paper presents a random forest for spatial predictions framework (RFsp) where buffer distances from observation points are used as explanatory variables, thus incorporating geographical proximity effects into the prediction process. The RFsp framework is illustrated with examples that use textbook datasets and apply spatial and spatio-temporal prediction to numeric, binary, categorical, multivariate and spatiotemporal variables. Performance of the RFsp framework is compared with the state-of-the-art kriging techniques using fivefold cross-validation with refitting. The results show that RFsp can obtain equally accurate and unbiased predictions as different versions of kriging. Advantages of using RFsp over kriging are that it needs no rigid statistical assumptions about the distribution and stationarity of the target variable, it is more flexible towards incorporating, combining and extending covariates of different types, and it possibly yields more informative maps characterizing the prediction error. RFsp appears to be especially attractive for building multivariate spatial prediction models that can be used as “knowledge engines” in various geoscience fields. Some disadvantages of RFsp are the exponentially growing computational intensity with increase of calibration data and covariates and the high sensitivity of predictions to input data quality.
The key to the success of the RFsp framework might be the training data quality, especially the quality of spatial sampling (to minimize extrapolation problems and any type of bias in data) and the quality of model validation (to ensure that accuracy is not affected by overfitting). For many data sets, especially those with fewer points and covariates and close-to-linear relationships, model-based geostatistics can still lead to more accurate predictions than RFsp.
Journal Article
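The abstract above describes using buffer distances from observation points as explanatory variables. The sketch below illustrates only that covariate-construction step, under the assumption that plain Euclidean distance in projected coordinates is acceptable. The function name `buffer_distance_features` and the sample coordinates are hypothetical and are not taken from the paper. The random forest itself would come from a separate library and is not shown.

```python
import math

def buffer_distance_features(points, observations):
    """Return, for each point, its Euclidean distance to every observation point.

    Each row can serve as a set of extra covariates for a spatial model.
    Assumption: coordinates are in a projected (planar) system.
    """
    return [
        [math.hypot(px - ox, py - oy) for ox, oy in observations]
        for px, py in points
    ]

# Hypothetical observation locations and prediction locations.
observations = [(0.0, 0.0), (3.0, 4.0)]
grid = [(0.0, 0.0), (3.0, 0.0)]

features = buffer_distance_features(grid, observations)
for location, row in zip(grid, features):
    print(location, row)
```

Each prediction location gets one distance column per observation point, so the number of added covariates grows with the number of observations. This may be one source of the computational cost mentioned in the abstract.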
Hierarchical generalized additive models in ecology: an introduction with mgcv
by Simpson, Gavin L.; Pedersen, Eric J.; Miller, David L.
in Computer applications, Data Science, Ecology
2019
In this paper, we discuss an extension to two popular approaches to modeling complex structures in ecological data: the generalized additive model (GAM) and the hierarchical model (HGLM). The hierarchical GAM (HGAM) allows modeling of nonlinear functional relationships between covariates and outcomes where the shape of the function itself varies between different grouping levels. We describe the theoretical connection between HGAMs, HGLMs, and GAMs, explain how to model different assumptions about the degree of intergroup variability in functional response, and show how HGAMs can be readily fitted using existing GAM software, the mgcv package in R. We also discuss computational and statistical issues with fitting these models, and demonstrate how to fit HGAMs on example data. All code and data used to generate this paper are available at: github.com/eric-pedersen/mixed-effect-gams.
Journal Article
Combining UAV-based hyperspectral imagery and machine learning algorithms for soil moisture content monitoring
2019
Soil moisture content (SMC) is an important factor that affects agricultural development in arid regions. Compared with the space-borne remote sensing system, the unmanned aerial vehicle (UAV) has been widely used because of its stronger controllability and higher resolution. It also provides a more convenient method for monitoring SMC than normal measurement methods that include field sampling and oven-drying techniques. However, research based on UAV hyperspectral data has not yet formed a standard procedure in arid regions. Therefore, a universal processing scheme is required. We hypothesized that combining pretreatments of UAV hyperspectral imagery under optimal indices and a set of field observations within a machine learning framework will yield a highly accurate estimate of SMC. Optimal 2D spectral indices act as indispensable variables and allow us to characterize a model’s SMC performance and spatial distribution. For this purpose, we used hyperspectral imagery and a total of 70 topsoil samples (0–10 cm) from the farmland (2.5 × 10⁴ m²) of Fukang City, Xinjiang Uygur Autonomous Region, China. The random forest (RF) method and extreme learning machine (ELM) were used to estimate the SMC using six methods of pretreatments combined with four optimal spectral indices. The validation accuracy of the estimated method clearly increased compared with that of linear models. The combination of pretreatments and indices by our assessment effectively eliminated the interference and the noises. Comparing two machine learning algorithms showed that the RF models were superior to the ELM models, and the best model was PIR (validation R² = 0.907, RMSEP = 1.477, and RPD = 3.396). The SMC map predicted via the best scheme was highly similar to the SMC map measured. We conclude that combining preprocessed spectral indices and machine learning algorithms allows estimation of SMC with high accuracy (validation R² = 0.907) via UAV hyperspectral imagery on a regional scale.
Ultimately, our program might improve management and conservation strategies for agroecosystems in arid regions.
Journal Article
Growth of water hyacinth biomass and its impact on the floristic composition of aquatic plants in a wetland ecosystem of the Brahmaputra floodplain of Assam, India
2023
Inland water plants, particularly those that thrive in shallow environments, are vital to the health of aquatic ecosystems. Water hyacinth is a typical example of inland species, an invasive aquatic plant that can drastically alter the natural plant community’s floral diversity. The present study aims to assess the impact of water hyacinth biomass on the floristic characteristics of aquatic plants in the Merbil wetland of the Brahmaputra floodplain, NE, India. Using a systematic sampling technique, data were collected from the field at regular intervals for one year (2021) to estimate monthly water hyacinth biomass. The total estimate of the wetland’s biomass was made using the Kriging interpolation technique. The Shannon-Wiener diversity index (H′), Simpson’s diversity index (D), dominance and evenness or equitability index (E), density, and frequency were used to estimate the floristic characteristics of aquatic plants in the wetland. The result shows that the highest biomass was recorded in September (408.1 tons/ha), while the lowest was recorded in March (38 tons/ha). The floristic composition of aquatic plants was significantly influenced by water hyacinth biomass. A total of 41 plant species from 23 different families were found in this tiny freshwater marsh during the floristic survey. Out of the total, 25 species were emergent, 11 were floating-leaved, and the remaining five were free-floating. Eichhornia crassipes was the wetland’s most dominant plant. A negative correlation was observed between water hyacinth biomass and the Shannon (H′) index, Simpson diversity index, and evenness. We observed that water hyacinths had changed the plant community structure of freshwater habitats in the study area. Water hyacinth’s rapid expansion blocked out sunlight, reducing the ecosystem’s productivity and ultimately leading to species loss.
The study will help devise plans for the sustainable management of natural resources and provide helpful guidance for maintaining the short- to medium-term ecological balance in similar wetlands.
Journal Article
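The diversity measures named in the abstract above can be illustrated with a small calculation. This is a sketch under stated assumptions, not the study's procedure: Shannon-Wiener is taken as H′ = −Σ p ln p, Simpson's diversity as 1 − Σ p², dominance as Σ p², and evenness as H′ / ln S, where S is the number of species present. The abstract does not say which variants the authors used, and the counts below are invented for illustration.

```python
import math

def diversity_indices(counts):
    """Compute common diversity indices from per-species abundance counts.

    Assumed definitions (the abstract does not specify the variants):
      shannon   H' = -sum(p * ln p)
      dominance      sum(p^2)
      simpson        1 - sum(p^2)
      evenness       H' / ln(S), with S = number of species present
    """
    present = [c for c in counts if c > 0]
    total = sum(present)
    proportions = [c / total for c in present]
    shannon = -sum(p * math.log(p) for p in proportions)
    dominance = sum(p * p for p in proportions)
    species = len(proportions)
    evenness = shannon / math.log(species) if species > 1 else 0.0
    return {
        "shannon": shannon,
        "dominance": dominance,
        "simpson": 1.0 - dominance,
        "evenness": evenness,
    }

# Hypothetical community with four equally abundant species.
example = diversity_indices([10, 10, 10, 10])
print(example)
```

With equal abundances the evenness reaches its maximum of 1. A community dominated by a single species, as the abstract reports for Eichhornia crassipes, would push evenness and both diversity values down.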
Global mapping of potential natural vegetation: an assessment of machine learning algorithms for estimating land potential
by Harrison, Sandy P.; Walsh, Markus G.; Wheeler, Ichsani
in Artificial intelligence, Bioclimatology, Biodiversity
2018
Potential natural vegetation (PNV) is the vegetation cover in equilibrium with climate that would exist at a given location if not impacted by human activities. PNV is useful for raising public awareness about land degradation and for estimating land potential. This paper presents results of assessing machine learning algorithms, namely neural networks (nnet package), random forest (ranger), gradient boosting (gbm), K-nearest neighborhood (class) and Cubist, for operational mapping of PNV. Three case studies were considered: (1) global distribution of biomes based on the BIOME 6000 data set (8,057 modern pollen-based site reconstructions), (2) distribution of forest tree taxa in Europe based on detailed occurrence records (1,546,435 ground observations), and (3) global monthly fraction of absorbed photosynthetically active radiation (FAPAR) values (30,301 randomly-sampled points). A stack of 160 global maps representing biophysical conditions over land, including atmospheric, climatic, relief, and lithologic variables, was used as explanatory variables. The overall results indicate that random forest gives the overall best performance. The highest accuracy for predicting BIOME 6000 classes (20) was estimated to be between 33% (with spatial cross-validation) and 68% (simple random sub-setting), with the most important predictors being total annual precipitation, monthly temperatures, and bioclimatic layers. Predicting forest tree species (73) resulted in mapping accuracy of 25%, with the most important predictors being monthly cloud fraction, mean annual and monthly temperatures, and elevation. Regression models for FAPAR (monthly images) gave an R-square of 90% with the most important predictors being total annual precipitation, monthly cloud fraction, CHELSA bioclimatic layers, and month of the year, respectively. Further developments of PNV mapping could include using all GBIF records to map the global distribution of plant species at different taxonomic levels.
This methodology could also be extended to dynamic modeling of PNV, so that future climate scenarios can be incorporated. Global maps of biomes, FAPAR and tree species at 1 km spatial resolution are available for download via http://dx.doi.org/10.7910/DVN/QQHCIK.
Journal Article
Quantitative estimation of soil salinity by means of different modeling methods and visible-near infrared (VIS–NIR) spectroscopy, Ebinur Lake Wetland, Northwest China
2018
Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS–NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and the fractional derivative order in the range of 0.0–2.0 order with an interval of 0.1. Two different modeling methods, namely, partial least squares regression (PLSR) and random forest (RF) with preprocessed reflectance were used for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectrum reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than those of PLSR. The most effective model was established based on RF with the 1.5 order derivative of absorbance with the optimal values of R² (0.93), RMSE (4.57 dS m⁻¹), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity.
Journal Article
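The abstract above reports RMSE and RPD as validation metrics. As a rough illustration, the sketch below computes both from paired observed and predicted values. It assumes RPD is the sample standard deviation of the observed values divided by the RMSE, which is a common convention. The abstract does not state the exact formula the authors used, and the numbers below are invented.

```python
import math
import statistics

def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rpd(observed, predicted):
    """Ratio of performance to deviation.

    Assumption: sample standard deviation of the observations divided by RMSE.
    """
    return statistics.stdev(observed) / rmse(observed, predicted)

# Hypothetical validation data (not from the study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]

print("RMSE:", rmse(observed, predicted))
print("RPD:", rpd(observed, predicted))
```

A larger RPD means the prediction error is small relative to the spread of the data, which is why the abstract compares its value of 2.78 against a threshold of 2.50.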
Individual tree crown delineation and tree species classification with hyperspectral and LiDAR data
by Dalponte, Michele; Gianelle, Damiano; Frizzera, Lorenzo
in Algorithms, Biomass, Classification
2019
An international data science challenge, called National Ecological Observatory Network—National Institute of Standards and Technology data science evaluation, was set up in autumn 2017 with the goal to improve the use of remote sensing data in ecological applications. The competition was divided into three tasks: (1) individual tree crown (ITC) delineation, for identifying the location and size of individual trees; (2) alignment between field surveyed trees and ITCs delineated on remote sensing data; and (3) tree species classification. In this paper, the methods and results of team Fondazione Edmund Mach (FEM) are presented. The ITC delineation (Task 1 of the challenge) was done using a region growing method applied to a near-infrared band of the hyperspectral images. The optimization of the parameters of the delineation algorithm was done in a supervised way on the basis of the Jaccard score using the training set provided by the organizers. The alignment (Task 2) between the delineated ITCs and the field surveyed trees was done using the Euclidean distance among the position, the height, and the crown radius of the ITCs and the field surveyed trees. The classification (Task 3) was performed using a support vector machine classifier applied to a selection of the hyperspectral bands and the canopy height model. The selection of the bands was done using the sequential forward floating selection method and the Jeffries Matusita distance. The results of the three tasks were very promising: team FEM ranked first in the data science competition in Task 1 and 2, and second in Task 3. The Jaccard score of the delineated crowns was 0.3402, and the results showed that the proposed approach delineated both small and large crowns. The alignment was correctly done for all the test samples. 
The classification results were good (overall accuracy of 88.1%, kappa accuracy of 75.7%, and mean class accuracy of 61.5%), although the accuracy was biased toward the most represented species.
Journal Article
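The Jaccard score mentioned in the abstract above measures the overlap between a delineated crown and a reference crown. A minimal sketch follows, assuming each crown is represented as a set of pixel coordinates. The competition's actual scoring procedure, including how delineated crowns are paired with reference crowns, is not described in the abstract, and the pixel sets below are hypothetical.

```python
def jaccard_score(delineated, reference):
    """Intersection over union of two crowns given as sets of (row, col) pixels."""
    union = delineated | reference
    if not union:
        # Assumption: two empty crowns are scored as 0.0 rather than raising.
        return 0.0
    return len(delineated & reference) / len(union)

# Hypothetical 2x2-pixel crowns that overlap in one row of two pixels.
delineated_crown = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference_crown = {(1, 0), (1, 1), (2, 0), (2, 1)}

score = jaccard_score(delineated_crown, reference_crown)
print(score)
```

The score is 1 only when the two crowns coincide exactly, so small positional offsets lower it quickly for small crowns.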
Soil carbon sequestration potential in global croplands
by Padarian, José; Minasny, Budiman; McBratney, Alex
in Agricultural land, Agricultural Science, Analysis
2022
Improving the amount of organic carbon in soils is an attractive alternative to partially mitigate climate change. However, the amount of carbon that can be potentially added to the soil is still being debated, and there is a lack of information on additional storage potential on global cropland. Soil organic carbon (SOC) sequestration potential is region-specific and conditioned by climate and management but most global estimates use fixed accumulation rates or time frames. In this study, we model SOC storage potential as a function of climate, land cover and soil. We used 83,416 SOC observations from global databases and developed a quantile regression neural network to quantify the SOC variation within soils with similar environmental characteristics. This allows us to identify similar areas that present higher SOC with the difference representing an additional storage potential. We estimated that the topsoils (0–30 cm) of global croplands (1,410 million hectares) hold 83 Pg C. The additional SOC storage potential in the topsoil of global croplands ranges from 29 to 65 Pg C. These values only equate to three to seven years of global emissions, potentially offsetting 35% of agriculture’s 85 Pg historical carbon debt estimate due to conversion from natural ecosystems. As SOC store is temperature-dependent, this potential is likely to reduce by 14% by 2040 due to climate change in a “business as usual” scenario. The results of this article can provide a guide to areas of focus for SOC sequestration, and highlight the environmental cost of agriculture.
Journal Article
Spatio-temporal evolution and prediction of carbon storage in Kunming based on PLUS and InVEST models
2023
Carbon storage is a critical ecosystem service provided by terrestrial environmental systems that can effectively reduce regional carbon emissions and is critical for achieving carbon neutrality and carbon peak. We conducted a study in Kunming and analyzed the land utilization data for 2000, 2010, and 2020. We assessed the features of land utilization conversion and forecasted land utilization under three development patterns in 2030 on the basis of the Patch-generating Land Use Simulation (PLUS) model. We used the Integrated Valuation of Ecosystem Services and Trade-offs (InVEST) model to estimate changes in carbon storage trends under three development scenarios in 2000, 2010, 2020, and 2030 and the impact of socioeconomic and natural factors on carbon storage. The results of the study indicated that (1) carbon storage is intimately associated with land utilization practices. Carbon storage in Kunming in 2000, 2010, and 2020 was 1.146 × 10⁸ t, 1.139 × 10⁸ t, and 1.120 × 10⁸ t, respectively. During the 20 years, forest land decreased by 142.28 km², and the decrease in forest land area caused a loss of carbon storage. (2) Carbon storage in 2030 was predicted to be 1.102 × 10⁸ t, 1.136 × 10⁸ t, and 1.105 × 10⁸ t, respectively, under the trend continuation scenario, eco-friendly scenario, and comprehensive development scenario, indicating that implementing ecological protection and cultivated land protection measures can facilitate regional ecosystem carbon storage restoration. (3) Impervious surfaces and vegetation have the greatest influence on carbon storage for the study area. A spatial global and local negative correlation was found between impervious surface coverage and ecosystem carbon storage. A spatial global and local positive correlation was found between NDVI and ecosystem carbon storage.
Therefore, ecological and farmland protection policies need to be strengthened, the expansion of impervious surfaces should be strictly controlled, and vegetation coverage should be improved.
Journal Article
A spatiotemporal ensemble machine learning framework for generating land use/land cover time-series maps for Europe (2000–2019) based on LUCAS, CORINE and GLAD Landsat
by Landa, Martin; Antonijević, Ognjen; van Diemen, Chris J.
in Accuracy, Automation, Classification
2022
A spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use/Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including five million harmonized LUCAS and CORINE Land Cover-derived training samples, (2) model building based on spatial k-fold cross-validation and hyper-parameter optimization, (3) prediction of the most probable class, class probabilities and model variance of predicted probabilities per pixel, (4) LULC change analysis on time-series of produced maps. The spatiotemporal ensemble model consists of a random forest, gradient boosted tree classifier, and an artificial neural network, with a logistic regressor as meta-learner. The results show that the most important variables for mapping LULC in Europe are: seasonal aggregates of Landsat green and near-infrared bands, multiple Landsat-derived spectral indices, long-term surface water probability, and elevation. Spatial cross-validation of the model indicates consistent performance across multiple years with overall accuracy (a weighted F1-score) of 0.49, 0.63, and 0.83 when predicting 43 (level-3), 14 (level-2), and five classes (level-1). Additional experiments show that spatiotemporal models generalize better to unknown years, outperforming single-year models on known-year classification by 2.7% and unknown-year classification by 3.5%. Results of the accuracy assessment using 48,365 independent test samples show an 87% match with the validation points. Results of time-series analysis (time-series of LULC probabilities and NDVI images) suggest forest loss in large parts of Sweden, the Alps, and Scotland. Positive and negative trends in NDVI in general match the land degradation and land restoration classes, with “urbanization” showing the most negative NDVI trend.
An advantage of using spatiotemporal ML is that the fitted model can be used to predict LULC in years that were not included in its training dataset, allowing generalization to past and future periods, e.g. to predict LULC for years prior to 2000 and beyond 2020. The generated LULC time-series data stack (ODSE-LULC), including the training points, is publicly available via the ODSE Viewer. Functions used to prepare data and run modeling are available via the eumap library for Python.
Journal Article