Catalogue Search | MBRL
7,259 result(s) for "Multivariate modelling"
Most published selection gradients are underestimated
by Westneat, David F.; Araya-Ajoy, Yimen G.; Dingemanse, Niels Jeroen
Subjects: Bias, Estimates, measurement error
2021
Ecologists and evolutionary biologists routinely estimate selection gradients. Most researchers seek to quantify selection on individual phenotypes, regardless of whether fixed or repeatedly expressed traits are studied. Selection gradients estimated to address such questions are attenuated unless analyses account for measurement error and biological sources of within-individual variation. Estimates of standardized selection gradients published in Evolution between 2010 and 2019 were primarily based on traits measured once (59% of 325 estimates). We show that those are attenuated: bias increases with decreasing repeatability but differently for linear versus nonlinear gradients. Others derived individual-mean trait values prior to analyses (41%), typically using few repeats per individual, which does not remove bias. We evaluated three solutions, all requiring repeated measures: (i) correcting gradients derived from classic models using estimates of trait correlations and repeatabilities, (ii) multivariate mixed-effects models, previously used for estimating linear gradients (seven estimates, 2%), which we expand to nonlinear analyses, and (iii) errors-in-variables models that account for within-individual variance, and are rarely used in selection studies. All approaches produced accurate estimates regardless of repeatability and type of gradient; however, errors-in-variables models produced more precise estimates and may thus be preferable.
Journal Article
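The attenuation described in the abstract above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: a single trait, a linear gradient, classical additive within-individual noise, and an unstandardized regression slope. The function name `simulate_attenuation` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_attenuation(repeatability, beta=0.5, n=200_000, seed=0):
    """Return the OLS slope of fitness on a trait measured once per individual."""
    rng = np.random.default_rng(seed)
    # True individual trait values with unit variance.
    true_trait = rng.normal(0.0, 1.0, n)
    # Within-individual SD chosen so that
    # repeatability = var(true) / (var(true) + var(within)).
    within_sd = np.sqrt((1.0 - repeatability) / repeatability)
    observed = true_trait + rng.normal(0.0, within_sd, n)
    # Fitness depends on the true trait, not on the noisy measurement.
    fitness = beta * true_trait + rng.normal(0.0, 1.0, n)
    # Simple-regression slope: cov(observed, fitness) / var(observed).
    cov = np.cov(observed, fitness)
    return cov[0, 1] / cov[0, 0]

for r in (1.0, 0.7, 0.4):
    print(f"repeatability={r:.1f}  estimated slope={simulate_attenuation(r):.3f}")
```

Under these assumptions the estimated slope should land close to `beta * repeatability`, so lower repeatability gives a more strongly underestimated gradient.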
Biomarkers for physical frailty and sarcopenia: state of the science and future developments
2015
Physical frailty and sarcopenia are two common and largely overlapping geriatric conditions upstream of the disabling cascade. The lack of a unique operational definition for physical frailty and sarcopenia and the complex underlying pathophysiology make the development of biomarkers for these conditions extremely challenging. Indeed, the current definitional ambiguities of physical frailty and sarcopenia, together with their heterogeneous clinical manifestations, impact the accuracy, specificity, and sensitivity of individual biomarkers proposed so far. In this review, the current state of the art in the development of biomarkers for physical frailty and sarcopenia is presented. A novel approach for biomarker identification and validation is also introduced that moves from the ‘one fits all’ paradigm to a multivariate methodology.
Journal Article
Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and non-linear terms
by Browne, William J.; Goldstein, Harvey; Carpenter, James R.
Subjects: Bayesian analysis, Bayesian method, Covariance
2014
The paper extends existing models for multilevel multivariate data with mixed response types to handle quite general types and patterns of missing data values in a wide range of multilevel generalized linear models. It proposes an efficient Bayesian modelling approach that allows missing values in covariates, including models where there are interactions or other functions of covariates such as polynomials. The procedure can also be used to produce multiply imputed complete data sets. A simulation study is presented as well as the analysis of a longitudinal data set. The paper also shows how existing multiprocess models for handling endogeneity can be extended by the framework proposed.
Journal Article
Plasma metabolites associated with type 2 diabetes in a Swedish population: a case–control study nested in a prospective cohort
2018
Aims/hypothesis
The aims of the present work were to identify plasma metabolites that predict future type 2 diabetes, to investigate the changes in identified metabolites among individuals who later did or did not develop type 2 diabetes over time, and to assess the extent to which inclusion of predictive metabolites could improve risk prediction.
Methods
We established a nested case–control study within the Swedish prospective population-based Västerbotten Intervention Programme cohort. Using untargeted liquid chromatography-MS metabolomics, we analysed plasma samples from 503 case–control pairs at baseline (a median time of 7 years prior to diagnosis) and samples from a subset of 187 case–control pairs at 10 years of follow-up. Discriminative metabolites between cases and controls at baseline were optimally selected using a multivariate data analysis pipeline adapted for large-scale metabolomics. Conditional logistic regression was used to assess associations between discriminative metabolites and future type 2 diabetes, adjusting for several known risk factors. Reproducibility of identified metabolites was estimated by intra-class correlation over the 10 year period among the subset of healthy participants; their systematic changes over time in relation to diagnosis among those who developed type 2 diabetes were investigated using mixed models. Risk prediction performance of models made from different predictors was evaluated using area under the receiver operating characteristic curve, discrimination improvement index and net reclassification index.
Results
We identified 46 predictive plasma metabolites of type 2 diabetes. Among novel findings, phosphatidylcholines (PCs) containing odd-chain fatty acids (C19:1 and C17:0) and 2-hydroxyethanesulfonate were associated with the likelihood of developing type 2 diabetes; we also confirmed previously identified predictive biomarkers. Identified metabolites strongly correlated with insulin resistance and/or beta cell dysfunction. Of 46 identified metabolites, 26 showed intermediate to high reproducibility among healthy individuals. Moreover, PCs with odd-chain fatty acids, branched-chain amino acids, 3-methyl-2-oxovaleric acid and glutamate changed over time along with disease progression among diabetes cases. Importantly, we found that a combination of five of the most robustly predictive metabolites significantly improved risk prediction if added to models with an a priori defined set of traditional risk factors, but only a marginal improvement was achieved when using models based on optimally selected traditional risk factors.
Conclusions/interpretation
Predictive metabolites may improve understanding of the pathophysiology of type 2 diabetes and reflect disease progression, but they provide limited incremental value in risk prediction beyond optimal use of traditional risk factors.
Journal Article
A generative model for evaluating missing data methods in large epidemiological cohorts
by Smith, Stephen M.; Radosavljević, Lav; Nichols, Thomas E.
Subjects: Algorithms, Biobanks, Biological Specimen Banks - statistics & numerical data
2025
Background
The potential value of large scale datasets is constrained by the ubiquitous problem of missing data, arising in either a structured or unstructured fashion. When imputation methods are proposed for large scale data, one limitation is the simplicity of existing evaluation methods. Specifically, most evaluations create synthetic data with only a simple, unstructured missing data mechanism which does not resemble the missing data patterns found in real data. For example, in the UK Biobank missing data tends to appear in blocks, because non-participation in one of the sub-studies leads to missingness for all sub-study variables.
Methods
We propose a tool for generating mixed type missing data mimicking key properties of a given real large scale epidemiological data set with both structured and unstructured missingness while accounting for informative missingness. The process involves identifying sub-studies using hierarchical clustering of missingness patterns and modelling the dependence of inter-variable correlation and co-missingness patterns.
Results
On the UK Biobank brain imaging cohort, we identify several large blocks of missing data. We demonstrate the use of our tool for evaluating several imputation methods, showing modest accuracy of imputation overall, with iterative imputation having the best performance. We compare our evaluations based on synthetic data to an exemplar study which includes variable selection on a single real imputed dataset, finding only small differences between the imputation methods though with iterative imputation leading to the most informative selection of variables.
Conclusions
We have created a framework for simulating large scale data that captures the complexities of the inter-variable dependence as well as structured and unstructured informative missingness. Evaluations using this framework highlight the immense challenge of data imputation in this setting and the need for improved missing data methods.
Journal Article
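The contrast between structured (block) and unstructured missingness described in the abstract above can be sketched in a few lines. This is not the authors' tool: it ignores informative missingness and inter-variable correlation, and the function `make_missing_mask`, the block layout and the rates are all hypothetical choices for illustration.

```python
import numpy as np

def make_missing_mask(n_rows, blocks, participation, cell_rate, seed=0):
    """Build a boolean mask (True = missing).

    blocks        : list of column-index lists, one per hypothetical sub-study
    participation : probability that a row takes part in each sub-study
    cell_rate     : probability that any single cell is missing at random
    """
    rng = np.random.default_rng(seed)
    n_cols = sum(len(cols) for cols in blocks)
    # Unstructured part: every cell is independently missing.
    mask = rng.random((n_rows, n_cols)) < cell_rate
    # Structured part: a non-participant loses the whole sub-study block.
    for cols, p in zip(blocks, participation):
        absent = rng.random(n_rows) >= p
        mask[np.ix_(absent, cols)] = True
    return mask

mask = make_missing_mask(5000, [[0, 1, 2], [3, 4]], [0.7, 0.9], 0.05)
print("share of rows missing the entire first block:", mask[:, :3].all(axis=1).mean())
```

Rows that skip a sub-study show up as fully missing across that block's columns, which is the pattern a purely cell-wise mechanism almost never produces.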
Soil Organic Carbon Content Prediction Using Soil-Reflected Spectra: A Comparison of Two Regression Methods
by Moreira, Luis Clenio Jario; Costa, Mirian Cristina Gomes; Teixeira, Adunias dos Santos
Subjects: absorption, Arid regions, Arid zones
2021
Quantifying the organic carbon content of soil over large areas is essential for characterising the soil and the effects of its management. However, analytical methods can be laborious and costly. Reflectance spectroscopy is a well-established and widespread method for estimating the chemical-element content of soils. The aim of this study was to estimate the soil organic carbon (SOC) content using hyperspectral remote sensing. The data were from soils from two localities in the semi-arid region of Brazil. The spectral reflectance factors of the collected soil samples were recorded at wavelengths ranging from 350–2500 nm. Pre-processing techniques were employed, including normalisation, Savitzky–Golay smoothing and first-order derivative analysis. The data (n = 65) were examined both jointly and by soil class, and subdivided into calibration and validation to independently assess the performance of the linear methods. Two multivariate models were calibrated using the SOC content estimated in the laboratory by principal component regression (PCR) and partial least squares regression (PLSR). The study showed significant success in predicting the SOC with transformed and untransformed data, yielding acceptable-to-excellent predictions (with the performance-to-deviation ratio ranging from 1.40–3.38). In general, the spectral reflectance factors of the soils decreased with the increasing levels of SOC. PLSR was considered more robust than PCR, whose wavelengths from 354 to 380 nm, 1685, 1718, 1757, 1840, 1876, 1880, 2018, 2037, 2042, and 2057 nm showed outstanding absorption characteristics between the predicted models. The results found here are of significant practical value for estimating SOC in Neosols and Cambisols in the semi-arid region of Brazil using VIS-NIR-SWIR spectroscopy.
Journal Article
A general modelling framework for multivariate disease mapping
2013
This paper deals with multivariate disease mapping. We propose a novel framework that encompasses most of the models already proposed. Our framework starts with a simple identity, reformulating Kronecker products of covariance matrices as simple matrix products. This formula is computationally convenient, and its generalizations reproduce most of the proposals in the disease mapping literature. Use of the identity leads to a flexible, general and computationally convenient modelling framework, making it possible to combine spatial dependence structures and different relationships between diseases with limited effort. Moreover, as the proposed modelling framework covers most of the Gaussian Markov random field-based multivariate disease mapping models in the literature, it allows comparison of all these models in a common context, thus helping us to understand them better.
Journal Article
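The abstract above refers to rewriting Kronecker products of covariance matrices as plain matrix products but does not state the identity. One standard identity of this kind is vec(A E Bᵀ) = (B ⊗ A) vec(E); whether it is the exact form used in the paper is an assumption. The sketch below checks it numerically, with `A`, `B` and `E` as hypothetical spatial, between-disease and independent-effect matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_areas, n_diseases = 4, 3
A = rng.normal(size=(n_areas, n_areas))        # hypothetical spatial factor
B = rng.normal(size=(n_diseases, n_diseases))  # hypothetical between-disease factor
E = rng.normal(size=(n_areas, n_diseases))     # matrix of independent effects

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

# Matrix-product form versus Kronecker form of the same linear map.
lhs = vec(A @ E @ B.T)
rhs = np.kron(B, A) @ vec(E)
identity_holds = np.allclose(lhs, rhs)

# If the entries of E are independent with unit variance, the covariance of
# vec(A E B^T) is (B B^T) kron (A A^T): a Kronecker product of two covariances.
cov_kron = np.kron(B @ B.T, A @ A.T)
cov_direct = np.kron(B, A) @ np.kron(B, A).T

print(identity_holds, np.allclose(cov_kron, cov_direct))
```

The practical point is that the small products `A @ E @ B.T` never require building the large `np.kron(B, A)` matrix explicitly.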
Fables of the Past
2020
Prehistoric landscape reconstructions are still considered an unsolved methodological issue in archaeological research, and this includes the perception and transformation of an individual landscape in relation to situational and local ecosystem performances. Which parts of the landscape offered the potential for land-use and which areas were rather unsuitable due to a variety of environmental preconditions? The modern perception of the archaeological record that is distributed in the modern landscape does not necessarily represent a realistic dispersal of past human activity, but rather reflects the current state of archaeological research and modern land-use strategies. This contribution provides a critical assessment of spatial analyses of large and unstructured archaeological datasets and the non-reconstructibility of past, individually perceived palaeolandscapes.
Journal Article
Modeling Asymmetric Dependence Structure of Air Pollution Characteristics: A Vine Copula Approach
by Abu Bakar, Sakhinah; Alias, Mohd Almie; Ismail, Mohd Sabri
Subjects: Air pollution, Air quality management, asymmetric copula
2024
Contaminated air is unhealthy for people to breathe and live in. To maintain the sustainability of clean air, air pollution must be analyzed and controlled, especially after unhealthy events. To do so, the characteristics of unhealthy events, namely intensity, duration, and severity, are studied using multivariate modeling. In this study, the vine copula approach is selected to study the characteristics data. Vine copula is chosen here because it is more potent than the standard multivariate distributions and multivariate copulas, especially in modeling the tails related to extreme events. Here, all nine different vine copulas are analyzed and compared based on model fitting and the comparison of models. In model fitting, the best model obtained is Rv123-Joint-MLE, a model with a root nodes sequence of 123, and optimized using the joint maximum likelihood. The components for the best model are the Tawn type 1 and Rotated Tawn type 1 180 degrees representing the pair copulas of (intensity, duration), and (intensity, severity), respectively, with the Survival Gumbel for the conditional pair copula of (duration, severity; intensity). Based on the best model, the tri-variate dependence structure of the intensity, duration, and severity relationship is positively correlated, skewed, and follows an asymmetric distribution. This indicates that the characteristics, including intensity, duration, and severity, tend to increase together. Using comparison tests, the best model is significantly different from others, whereas only two models are quite similar. This shows that the best model is well-fitted, compared to most models. Overall, this paper highlights the capability of vine copula in modeling the asymmetric dependence structure of air pollution characteristics, where the obtained model has a better potential to become a tool to assess the risks of extreme events in future work.
Journal Article
Groundwater Suitability for Drinking and Irrigation Using Water Quality Indices and Multivariate Modeling in Makkah Al-Mukarramah Province, Saudi Arabia
by El Osta, Maged; Masoud, Milad; Elsayed, Salah
Subjects: absorption, Agricultural production, Analysis
2022
Water shortage and quality are major issues in many places, particularly arid and semi-arid regions such as Makkah Al-Mukarramah province, Saudi Arabia. The current work was conducted to examine the geochemical mechanisms influencing the chemistry of groundwater and assess groundwater resources through several water quality indices (WQIs), GIS methods, and the partial least squares regression model (PLSR). For that, 59 groundwater wells were tested for different physical and chemical parameters using conventional analytical procedures. The results showed that the average content of ions was as follows: Na+ > Ca2+ > Mg2+ > K+ and Cl− > SO42− > HCO32− > NO3− > CO3−. Under the stress of evaporation and saltwater intrusion associated with the reverse ion exchange process, the predominant hydrochemical facies were Ca-HCO3, Na-Cl, mixed Ca-Mg-Cl-SO4, and Na-Ca-HCO3. The drinking water quality index (DWQI) has indicated that only 5% of the wells were categorized under good to excellent for drinking while the majority (95%) were poor to unsuitable for drinking, and required appropriate treatment. Furthermore, the irrigation water quality index (IWQI) has indicated that 45.5% of the wells were classified under high to severe restriction for agriculture, and can be utilized only for high salt tolerant plants. The majority (54.5%) were deemed moderate to no restriction for irrigation, with no toxicity concern for most plants. Agriculture indicators such as total dissolved solids (TDS), potential salinity (PS), sodium absorption ratio (SAR), and residual sodium carbonate (RSC) had mean values of 2572.30, 33.32, 4.84, and −21.14, respectively. However, the quality of the groundwater in the study area improves with increased rainfall and thus recharging the Quaternary aquifer. The PLSR models, which are based on physicochemical characteristics, have been shown to be the most efficient as alternative techniques for determining the six WQIs. For instance, the PLSR models of all IWQs had determination coefficients values of R2 ranging between 0.848 and 0.999 in the Cal., and between 0.848 and 0.999 in the Val. datasets, and had model accuracy varying from 0.824 to 0.999 in the Cal., and from 0.817 to 0.989 in the Val. datasets. In conclusion, the combination of physicochemical parameters, WQIs, and multivariate modeling with statistical analysis and GIS tools is a successful and adaptable methodology that provides a comprehensive picture of groundwater quality and governing mechanisms.
Journal Article