Catalogue Search | MBRL
10,925 result(s) for "empirical regression"
A Comparison of Estimating Crop Residue Cover from Sentinel-2 Data Using Empirical Regressions and Machine Learning Methods
by Zhang, Hongyan; Xie, Qiaoyun; Wang, Yeqiao
in Algorithms, artificial intelligence, Artificial neural networks
2020
Quantifying crop residue cover (CRC) on field surfaces is important for monitoring the tillage intensity and promoting sustainable management. Remote-sensing-based techniques have proven practical for determining CRC; however, the methods used are primarily limited to empirical regression based on crop residue indices (CRIs). This study provides a systematic evaluation of empirical regressions and machine learning (ML) algorithms based on their ability to estimate CRC using Sentinel-2 Multispectral Instrument (MSI) data. Unmanned aerial vehicle orthomosaics were used to extract ground CRC for training Sentinel-2 data-based CRC models. For empirical regression, nine MSI bands, 10 published CRIs, three proposed CRIs, and four mean textural features were evaluated using univariate linear regression. The best performance was obtained by a three-band index calculated using (B2 − B4)/(B2 − B12), with an R²cv of 0.63 and RMSEcv of 6.509%, using a 10-fold cross-validation. The methodologies of partial least squares regression (PLSR), artificial neural network (ANN), Gaussian process regression (GPR), support vector regression (SVR), and random forest (RF) were compared with four groups of predictors, including nine MSI bands, 13 CRIs, a combination of MSI bands and mean textural features, and a combination of CRIs and textural features. In general, ML approaches achieved high accuracy. A PLSR model with 13 CRIs and textural features resulted in an accuracy of R²cv = 0.66 and RMSEcv = 6.427%. An RF model with predictors of MSI bands and textural features estimated CRC with an R²cv = 0.61 and RMSEcv = 6.415%. The estimation was improved by an SVR model with the same input predictors (R²cv = 0.67, RMSEcv = 6.343%), followed by a GPR model based on CRIs and textural features. The performance of GPR models was further improved by optimal input variables. A GPR model with six input variables, three MSI bands and three textural features, performed the best, with R²cv = 0.69 and RMSEcv = 6.149%. This study provides a reference for estimating CRC from Sentinel-2 imagery using ML approaches. The GPR approach is recommended. A combination of spectral information and textural features leads to an improvement in the retrieval of CRC.
Journal Article
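The univariate setup described in the abstract above (a three-band index regressed against CRC and scored with 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the random fold assignment, and the choice to pool held-out predictions before computing cross-validated R² and RMSE are all assumptions made for the example.

```python
import random


def three_band_index(b2, b4, b12):
    """Three-band index (B2 - B4) / (B2 - B12) quoted in the abstract."""
    return (b2 - b4) / (b2 - b12)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_cv(x, y, k=10, seed=0):
    """Return (R^2, RMSE) computed from pooled held-out predictions.

    Assumption: metrics are pooled over all folds; the paper may define
    its cross-validated metrics differently. Each training split needs at
    least two distinct x values.
    """
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    predictions = [0.0] * len(x)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train],
                                    [y[i] for i in train])
        for i in fold:
            predictions[i] = slope * x[i] + intercept
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    rmse = (ss_res / len(y)) ** 0.5
    return 1.0 - ss_res / ss_tot, rmse
```

In use, each sample's index value would be computed with `three_band_index` from its B2, B4, and B12 reflectances, and the resulting list passed to `kfold_cv` together with the matching ground CRC values.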
An updated review of Goodness-of-Fit tests for regression models
2013
This survey intends to collect the developments on Goodness-of-Fit for regression models during the last 20 years, from the very first origins with the proposals based on the idea of the tests for density and distribution, until the most recent advances for complex data and models. Far from being exhaustive, the contents of this paper are focused on two main classes of test statistics: smoothing-based tests (kernel-based) and tests based on empirical regression processes, although other tests based on Maximum Likelihood ideas are also considered. Starting from the simplest case of testing a parametric family for the regression curves, the contributions in this field also provide testing procedures in semiparametric, nonparametric, and functional models, and deal with more complex settings, such as those involving dependent or incomplete data.
Journal Article
Evaluating Empirical Regression, Machine Learning, and Radiative Transfer Modelling for Estimating Vegetation Chlorophyll Content Using Bi-Seasonal Hyperspectral Images
2019
Different types of methods have been developed to retrieve vegetation attributes from remote sensing data, including conventional empirical regressions (i.e., linear regression (LR)), advanced empirical regressions (e.g., multivariable linear regression (MLR), partial least square regression (PLSR)), machine learning (e.g., random forest regression (RFR), decision tree regression (DTR)), and radiative transfer modelling (RTM, e.g., PROSAIL). Given that each algorithm has its own strengths and weaknesses, it is essential to compare them and evaluate their effectiveness. Previous studies have mainly used single-date multispectral imagery or ground-based hyperspectral reflectance data for evaluating the models, while multi-seasonal hyperspectral images have been rarely used. Extensive spectral and spatial information in hyperspectral images, as well as temporal variations of landscapes, potentially influence the model performance. In this research, LR, PLSR, RFR, and PROSAIL, representing different types of methods, were evaluated for estimating vegetation chlorophyll content from bi-seasonal hyperspectral images (i.e., a middle- and a late-growing season image, respectively). Results show that the PLSR and RFR generally performed better than LR and PROSAIL. RFR achieved the highest accuracy for both images. This research provides insights on the effectiveness of different models for estimating vegetation chlorophyll content using hyperspectral images, aiming to support future vegetation monitoring research.
Journal Article
Linear regression MDP scheme for discrete backward stochastic differential equations under general conditions
2015
We design a numerical scheme for solving the Multi-step Forward Dynamic Programming (MDP) equation arising from the time-discretization of backward stochastic differential equations. The generator is assumed to be locally Lipschitz, which includes some cases of quadratic drivers. When the large sequence of conditional expectations is computed using empirical least-squares regressions, under general conditions we establish an upper bound error as the average, rather than the sum, of local regression errors only, suggesting that our error estimation is tight. Despite the nested regression problems, the interdependency errors are justified to be at most of the order of the statistical regression errors (up to logarithmic factor). Finally, we optimize the algorithm parameters, depending on the dimension and on the smoothness of value functions, in the limit as the time mesh size goes to zero and we compute the complexity needed to achieve a given accuracy. Numerical experiments are presented illustrating theoretical convergence estimates.
Journal Article
Impact of Land Use Change on Water Conservation: A Case Study of Zhangjiakou in Yongding River
2021
The implementation of ecological projects can largely change regional land use patterns, in turn altering the local hydrological process. Articulating these changes and their effects on ecosystem services, such as water conservation, is critical to understanding the impacts of land use activities and in directing future land planning toward regional sustainable development. Taking Zhangjiakou City of the Yongding River, a region where various ecological projects have been implemented, as the study area, the impact of land use changes on various hydrological components and water conservation capacity from 2000 to 2015 was simulated based on the Soil and Water Assessment Tool (SWAT) model. An empirical regression model based on partial least squares was established to explore the contribution of different land use changes on water conservation. With special focus on the forest having the most complex effects on the hydrological process, the impacts of forest type and age on the water conservation capacity are discussed on different scales. Results show that between 2000 and 2015, the area of forest, grassland and cultivated land decreased by 0.05%, 0.98% and 1.64%, respectively, which reduced the regional evapotranspiration (0.48%) and soil water content (0.72%). The increase in settlement area (42.23%) is the main reason for the increase in water yield (14.52%). Most land use covered by vegetation has strong water conservation capacity, and the water conservation capacity of the forest is particularly outstanding. Farmland and settlements tend to have a negative effect on water conservation. The water conservation capacity of forest at all scales decreased significantly with the growth of forest (p < 0.05), while the water conservation capacity of different tree species had no significant difference. For the study area, increasing the forest area will be an effective way to improve the water conservation function. Planting evergreen conifers can rapidly improve the regional water conservation capacity, while planting deciduous conifers is of great benefit to long-term sustainable development.
Journal Article
A generalized Hosmer–Lemeshow goodness-of-fit test for a family of generalized linear models
by Loughin, Thomas M.; Surjanovic, Nikola; Lockhart, Richard A.
in Asymptotic methods, Economics, Finance
2024
Generalized linear models (GLMs) are very widely used, but formal goodness-of-fit (GOF) tests for the overall fit of the model seem to be in wide use only for certain classes of GLMs. We develop and apply a new goodness-of-fit test, similar to the well-known and commonly used Hosmer–Lemeshow (HL) test, that can be used with a wide variety of GLMs. The test statistic is a variant of the HL statistic, but we rigorously derive an asymptotically correct sampling distribution using methods of Stute and Zhu (Scand J Stat 29(3):535–545, 2002) and demonstrate its consistency. We compare the performance of our new test with other GOF tests for GLMs, including a naive direct application of the HL test to the Poisson problem. Our test provides competitive or comparable power in various simulation settings and we identify a situation where a naive version of the test fails to hold its size. Our generalized HL test is straightforward to implement and interpret and an R package is publicly available.
Journal Article
On the Semi-Automatic Retrieval of Biophysical Parameters Based on Spectral Index Optimization
by Delegido, Jesús; Verrelst, Jochem; Moreno, José
in Assessments, biophysical parameter retrieval, empirical regression models
2014
Regression models based on spectral indices are typically empirical formulae enabling the mapping of biophysical parameters derived from Earth Observation (EO) data. Due to its empirical nature, it remains nevertheless uncertain to what extent a selected regression model is the most appropriate one, until all band combinations and curve fitting functions are assessed. This paper describes the application of a Spectral Index (SI) assessment toolbox in the Automated Radiative Transfer Models Operator (ARTMO) package. ARTMO enables semi-automatic retrieval and mapping of biophysical parameters from optical remote sensing observations. The SI toolbox facilitates the assessment of biophysical parameter retrieval accuracy of established as well as new and generic SIs. For instance, based on the SI formulation used, all possible band combinations of formulations with up to ten bands can be defined and evaluated. Several options are available in the SI assessment: calibration/validation data partitioning, the addition of noise and the definition of curve fitting models. To illustrate its functioning, all two-band combinations according to simple ratio (SR) and normalized difference (ND) formulations as well as various fitting functions (linear, exponential, power, logarithmic, polynomial) have been assessed. HyMap imaging spectrometer (430–2490 nm) data obtained during the SPARC-2003 campaign in Barrax, Spain, have been used to extract leaf area index (LAI) and leaf chlorophyll content (LCC) estimates. For both SR and ND formulations the most sensitive regions have been identified for two-band combinations of green (539–570 nm) with longwave SWIR (2421–2453 nm) for LAI (r2: 0.83) and far-red (692 nm) with NIR (1340 nm) or shortwave SWIR (1661–1686 nm) for LCC (r2: 0.93). Polynomial, logarithmic and linear fitting functions led to similar best correlations, though spatial differences emerged when applying the functions to HyMap imagery. 
We suggest that a systematic SI assessment is a strong requirement in the quality assurance approach for accurate biophysical parameter retrieval.
Journal Article
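The exhaustive two-band assessment described in the abstract above, using simple ratio (SR) and normalized difference (ND) formulations, can be sketched along these lines. This is a hedged illustration in plain Python and does not use or reproduce the ARTMO toolbox: the function names and the `spectra` layout are hypothetical, only a linear fit is scored (through the squared Pearson correlation), and the calibration/validation partitioning and noise options mentioned in the abstract are left out.

```python
from itertools import permutations


def simple_ratio(a, b):
    """SR formulation: a / b."""
    return a / b


def normalized_difference(a, b):
    """ND formulation: (a - b) / (a + b)."""
    return (a - b) / (a + b)


def r_squared(x, y):
    """Squared Pearson correlation, i.e. r^2 of a linear fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no usable correlation
    return (sxy * sxy) / (sxx * syy)


def best_band_pair(spectra, target, formulation):
    """Score every ordered band pair and return (r2, band_a, band_b).

    spectra: dict mapping a band name to its per-sample reflectances
             (hypothetical layout chosen for this sketch).
    target:  per-sample biophysical parameter values, e.g. LAI.
    """
    best = None
    for band_a, band_b in permutations(spectra, 2):
        index = [formulation(ra, rb)
                 for ra, rb in zip(spectra[band_a], spectra[band_b])]
        score = r_squared(index, target)
        if best is None or score > best[0]:
            best = (score, band_a, band_b)
    return best
```

Ordered pairs are enumerated because SR(a, b) and SR(b, a) are not linearly related, so the two orderings can score differently. For ND the two orderings differ only in sign and yield the same r².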
Mapping and estimating water quality parameters in the Volta Lake's Kpong Headpond of Ghana using regression model and Landsat 8
by Appiah Boamah, Linda; Anornu, Geophrey Kwame; Gyamfi, Charles
in Chlorophyll-a, Climate Change, Ecology - Environment Studies
2024
Sub-Saharan Africa faces a number of essential issues, including water quality. As such, evaluating the surface water quality of lakes and reservoirs is a crucial part of environmental monitoring and management, especially in a region where these water bodies serve as a source of livelihood for communities living around them. Samples for water quality parameters (WQPs) are usually taken from the site and sent to the laboratory for measurement and analysis. However, this traditional method is time-consuming, costly, and labor-intensive. Combining geographic information system and remote sensing (RS) allows researchers to analyze WQPs more conveniently. This study, therefore, used RS technology to map and estimate WQPs and correlated the estimates with in-situ measurements. Using the empirical regression model and Landsat 8, WQPs such as chlorophyll-a (Chl-a), total suspended solids (TSS) and turbidity were estimated. The results from RS were correlated with the in-situ measurements of water quality. The results showed that the in-situ Chl-a levels varied from 0.206 to 13.5 mg/L, averaging 5.1 mg/L. The Chl-a values estimated from Landsat 8 had R² of 0.883 and 0.853, respectively, for the two periods (17 December 2022 and 16 March 2023). The green band (B3) was more instrumental in detecting Chl-a. The in-situ measurement for TSS ranged between 18 and 48 mg/L, with a mean value of 28.7 mg/L. These readings were low and within tolerable bounds of 50 mg/L. High TSS concentrations were found near farms and communities with a significant influx of silt into the surrounding lake. The comparison of in-situ water quality and the reflectance from satellite data showed that the turbidity estimated from the sensor for the two periods had R² > 0.65. The study showed that the combination of the Landsat image and in-situ measurement offers an effective way to provide timely and affordable estimation of WQPs.
Journal Article
Rape (Brassica napus L.) Growth Monitoring and Mapping Based on Radarsat-2 Time-Series Data
2018
In this study, 27 polarimetric parameters were extracted from Radarsat-2 polarimetric synthetic aperture radar (SAR) at each growth stage of the rape crop. The sensitivity to growth parameters such as stem height, leaf area index (LAI), and biomass was investigated as a function of days after sowing. Based on the sensitivity analysis, five empirical regression models were compared to determine the best model for stem height, LAI, and biomass inversion. Of these five models, quadratic models had higher R² values than other models in most cases of growth parameter inversions, but when these results were related to physical scattering mechanisms, the inversion results produced overestimation in the performance of some parameters. By contrast, linear and logarithmic models, which had lower R² values than the quadratic models, had stable performance for growth parameter inversions, particularly in terms of their performance at each growth stage. The best biomass inversion performance was acquired by the volume component of a quadratic model, with an R² value of 0.854 and root mean square error (RMSE) of 109.93 g m⁻². The best LAI inversion was also acquired by a quadratic model, but used the radar vegetation index (Cloude), with an R² value of 0.8706 and RMSE of 0.56 m² m⁻². Stem height was acquired by scattering angle alpha (α) using a logarithmic model, with an R² value of 0.926 and RMSE of 11.09 cm. The performances of these models were also analysed for biomass estimation at the second growth stage (P2), third growth stage (P3), and fourth growth stage (P4). The results showed that the models built at the P3 stage had better substitutability with the models built during all of the growth stages. From the mapping results, we conclude that a model built at the P3 stage can be used for rape biomass inversion, with 90% of estimation errors being less than 100 g m⁻².
Journal Article
Distribution-free testing in linear and parametric regression
2021
Recently, a distribution-free approach for testing parametric hypotheses based on unitary transformations has been suggested in Khmaladze (Ann Stat 41:2979–2993, 2013, Bernoulli 22:563–588, 2016) and further studied in Nguyen (Metrika 80:153–170, 2017) and Roberts (Stat Probab Lett 150:47–53, 2019). In this paper, we show that the transformation takes very simple form in distribution-free testing of linear regression. Then, we extend it to the general parametric regression with vector-valued covariates.
Journal Article