156 result(s) for "approximate bayesian inference"
Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage with our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
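The building block behind this approach, a single Laplace approximation, can be sketched on a toy problem. The example below is not INLA itself (which nests such approximations and integrates over hyperparameters); it only assumes a hypothetical one-parameter model, counts[i] ~ Poisson(exp(x)) with a Gaussian prior on x, and approximates the posterior of x by a Gaussian centred at the mode. The function name and data are illustrative.

```python
import math

def laplace_approx_poisson(counts, prior_precision=1.0, iters=50):
    """Gaussian (Laplace) approximation to p(x | counts) for the toy model
    counts[i] ~ Poisson(exp(x)), x ~ Normal(0, 1 / prior_precision).

    Returns (mode, variance) of the approximating Gaussian.
    """
    n, total = len(counts), sum(counts)
    x = 0.0
    for _ in range(iters):
        # First and second derivatives of the log posterior at x.
        grad = total - n * math.exp(x) - prior_precision * x
        hess = -n * math.exp(x) - prior_precision
        step = grad / hess
        x -= step  # Newton step towards the posterior mode
        if abs(step) < 1e-12:
            break
    hess = -n * math.exp(x) - prior_precision
    # The variance is the negative inverse curvature at the mode.
    return x, -1.0 / hess

mode, var = laplace_approx_poisson([3, 5, 4, 6, 2])
print(f"approximate posterior: Normal(mean={mode:.3f}, var={var:.4f})")
```

Because the log posterior here is strictly concave, Newton's method converges in a handful of iterations, which hints at why such approximations can be much cheaper than sampling.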
An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach
Continuously indexed Gaussian fields (GFs) are the most important ingredient in spatial statistical modelling and geostatistics. The specification through the covariance function gives an intuitive interpretation of the field properties. On the computational side, GFs are hampered with the big n problem, since the cost of factorizing dense matrices is cubic in the dimension. Although computational power today is at an all time high, this fact seems still to be a computational bottleneck in many applications. Along with GFs, there is the class of Gaussian Markov random fields (GMRFs) which are discretely indexed. The Markov property makes the precision matrix involved sparse, which enables the use of numerical algorithms for sparse matrices, that for fields in R^2 only use the square root of the time required by general algorithms. The specification of a GMRF is through its full conditional distributions but its marginal properties are not transparent in such a parameterization. We show that, using an approximate stochastic weak solution to (linear) stochastic partial differential equations, we can, for some GFs in the Matérn class, provide an explicit link, for any triangulation of R^d, between GFs and GMRFs, formulated as a basis function representation. The consequence is that we can take the best from the two worlds and do the modelling by using GFs but do the computations by using GMRFs. Perhaps more importantly, our approach generalizes to other covariance functions generated by SPDEs, including oscillating and non-stationary GFs, as well as GFs on manifolds. We illustrate our approach by analysing global temperature data with a non-stationary model defined on a sphere.
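The computational payoff of a sparse precision matrix can be illustrated in one dimension. The sketch below is not the SPDE construction from the abstract; it assumes the simplest GMRF, a stationary AR(1) process with unit innovation variance, whose precision matrix Q is tridiagonal, and solves Q x = b in O(n) time with the Thomas algorithm instead of the O(n^3) cost of a dense factorization. All names and values are illustrative.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) with the Thomas algorithm.

    lower[i] is the entry left of diag[i] (lower[0] is unused);
    upper[i] is the entry right of diag[i] (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def tridiagonal_matvec(lower, diag, upper, x):
    """Multiply the tridiagonal matrix by x, touching only nonzero entries."""
    n = len(diag)
    out = []
    for i in range(n):
        value = diag[i] * x[i]
        if i > 0:
            value += lower[i] * x[i - 1]
        if i < n - 1:
            value += upper[i] * x[i + 1]
        out.append(value)
    return out

# Precision matrix of a stationary AR(1) process with coefficient phi.
n, phi = 6, 0.8
diag = [1.0] + [1.0 + phi**2] * (n - 2) + [1.0]
lower = [0.0] + [-phi] * (n - 1)
upper = [-phi] * (n - 1) + [0.0]
rhs = [1.0] * n

x = solve_tridiagonal(lower, diag, upper, rhs)
residual = max(abs(a - b) for a, b in
               zip(tridiagonal_matvec(lower, diag, upper, x), rhs))
print("max residual of Q x = b:", residual)
```

In two or more dimensions the precision matrix is no longer tridiagonal, so general sparse Cholesky routines take the place of this hand-written solver.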
PID Control as a Process of Active Inference with Linear Generative Models
In the past few decades, probabilistic interpretations of brain functions have become widespread in cognitive science and neuroscience. In particular, the free energy principle and active inference are increasingly popular theories of cognitive functions that claim to offer a unified understanding of life and cognition within a general mathematical framework derived from information and control theory, and statistical mechanics. However, we argue that if the active inference proposal is to be taken as a general process theory for biological systems, it is necessary to understand how it relates to existing control theoretical approaches routinely used to study and explain biological systems. For example, recently, PID (Proportional-Integral-Derivative) control has been shown to be implemented in simple molecular systems and is becoming a popular mechanistic explanation of behaviours such as chemotaxis in bacteria and amoebae, and robust adaptation in biochemical networks. In this work, we will show how PID controllers can fit a more general theory of life and cognition under the principle of (variational) free energy minimisation when using approximate linear generative models of the world. This more general interpretation also provides a new perspective on traditional problems of PID controllers such as parameter tuning as well as the need to balance performances and robustness conditions of a controller. Specifically, we then show how these problems can be understood in terms of the optimisation of the precisions (inverse variances) modulating different prediction errors in the free energy functional.
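For readers unfamiliar with the controller being reinterpreted, a textbook discrete PID loop is sketched below. It does not implement the paper's free energy derivation; it assumes a hypothetical first-order plant dx/dt = -x + u integrated with forward Euler, and the gains are arbitrary illustrative values.

```python
def simulate_pid(kp, ki, kd, setpoint=1.0, dt=0.01, steps=2000):
    """Run a discrete PID loop on the toy plant dx/dt = -x + u.

    Returns the plant state after `steps` forward-Euler updates.
    """
    x = 0.0
    integral = 0.0
    prev_error = setpoint - x  # makes the first derivative term zero
    for _ in range(steps):
        error = setpoint - x
        integral += error * dt                   # I: accumulated error
        derivative = (error - prev_error) / dt   # D: rate of change of error
        u = kp * error + ki * integral + kd * derivative
        x += dt * (-x + u)                       # forward Euler step of the plant
        prev_error = error
    return x

final_state = simulate_pid(kp=2.0, ki=1.0, kd=0.1)
print(f"state after 20 simulated seconds: {final_state:.4f}")
```

The three gains kp, ki and kd are the parameters whose tuning the abstract relates to the precisions weighting different prediction errors.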
Sparse Bayesian Neural Networks: Bridging Model and Parameter Uncertainty through Scalable Variational Inference
Bayesian neural networks (BNNs) have recently regained a significant amount of attention in the deep learning community due to the development of scalable approximate Bayesian inference techniques. There are several advantages of using a Bayesian approach: parameter and prediction uncertainties become easily available, facilitating more rigorous statistical analysis. Furthermore, prior knowledge can be incorporated. However, the construction of scalable techniques that combine both structural and parameter uncertainty remains a challenge. In this paper, we apply the concept of model uncertainty as a framework for structural learning in BNNs and, hence, make inferences in the joint space of structures/models and parameters. Moreover, we suggest an adaptation of a scalable variational inference approach with reparametrization of marginal inclusion probabilities to incorporate the model space constraints. Experimental results on a range of benchmark datasets show that we obtain comparable accuracy results with the competing models, but based on methods that are much more sparse than ordinary BNNs.
A combined statistical and machine learning approach for spatial prediction of extreme wildfire frequencies and sizes
Motivated by the Extreme Value Analysis 2021 (EVA 2021) data challenge, we propose a method based on statistics and machine learning for the spatial prediction of extreme wildfire frequencies and sizes. This method is tailored to handle large datasets, including missing observations. Our approach relies on a four-stage, bivariate, sparse spatial model for high-dimensional zero-inflated data that we develop using stochastic partial differential equations (SPDE), allowing sparse precision matrices for the latent processes. In Stage 1, the observations are separated into zero/nonzero categories and modeled using a two-layered hierarchical Bayesian sparse spatial model to estimate the probabilities of these two categories. In Stage 2, we first obtain empirical estimates of the spatially-varying mean and variance profiles across the spatial locations for the positive observations and smooth those estimates using fixed rank kriging. This approximate Bayesian inference method is employed to avoid the high computational burden of large spatial data modeling using spatially-varying coefficients. In Stage 3, we further model the standardized log-transformed positive observations from the second stage using a sparse bivariate spatial Gaussian process. The Gaussian distribution assumption for wildfire counts developed in the third stage is computationally effective but erroneous. Thus, in Stage 4, the predicted exceedance probabilities are post-processed using Random Forests. We draw posterior inference for Stages 1 and 3 using Markov chain Monte Carlo (MCMC) sampling. We then create a cross-validation scheme for the artificially generated gaps and compare the EVA 2021 prediction scores of the proposed model to those obtained using some competitors.
Bat echolocation call identification for biodiversity monitoring: a probabilistic approach
Bat echolocation call identification methods are important in developing efficient cost-effective methods for large-scale bioacoustic surveys for global biodiversity monitoring and conservation planning. Such methods need to provide interpretable probabilistic predictions of species since they will be applied across many different taxa in a diverse set of applications and environments. We develop such a method using a multinomial probit likelihood with independent Gaussian process priors and study its feasibility on a data set from an on-going study of 21 species, five families and 1800 bat echolocation calls collected from Mexico, a hotspot of bat biodiversity. We propose an efficient approximate inference scheme based on the expectation propagation algorithm and observe that the overall methodology significantly improves on currently adopted approaches to bat call classification by providing an approach which can be easily generalized across different species and call types and is fully probabilistic. Implementation of this method has the potential to provide robust species identification tools for biodiversity acoustic bat monitoring programmes across a range of taxa and spatial scales.
Sparse probit linear mixed model
Linear mixed models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow one to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to continuous phenotypes. We introduce the sparse probit linear mixed model (Probit-LMM), where we generalize the LMM modeling paradigm to binary phenotypes. As a technical challenge, the model no longer possesses a closed-form likelihood function. In this paper, we present a scalable approximate inference algorithm that lets us fit the model to high-dimensional data sets. We show on three real-world examples from different domains that in the setup of binary labels, our algorithm leads to better prediction accuracies and also selects features which show less correlation with the confounding factors.
Real-Time Semiparametric Regression
We develop algorithms for performing semiparametric regression analysis in real time, with data processed as it is collected and made immediately available via modern telecommunications technologies. Our definition of semiparametric regression is quite broad and includes, as special cases, generalized linear mixed models, generalized additive models, geostatistical models, wavelet nonparametric regression models and their various combinations. Fast updating of regression fits is achieved by couching semiparametric regression into a Bayesian hierarchical model or, equivalently, graphical model framework and employing online mean field variational ideas. An Internet site attached to this article, realtime-semiparametric-regression.net, illustrates the methodology for continually arriving stock market, real estate, and airline data. Flexible real-time analyses based on increasingly ubiquitous streaming data sources stand to benefit. This article has online supplementary material.
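The idea of updating a fit as each observation arrives can be sketched with a much simpler model than the ones in the abstract. The example below is not the online mean field variational algorithm; it assumes a conjugate toy problem (a Gaussian mean with known noise precision), where the posterior after each new data point is available in closed form and the streamed result matches the batch result exactly. Function and variable names are illustrative.

```python
def update_normal_mean(prior_mean, prior_precision, y, noise_precision=1.0):
    """One streaming Bayesian update for the mean of Gaussian data.

    Combines the current Normal(prior_mean, 1 / prior_precision) belief
    with a single observation y that has known noise precision.
    """
    post_precision = prior_precision + noise_precision
    post_mean = (prior_precision * prior_mean + noise_precision * y) / post_precision
    return post_mean, post_precision

# Start from a vague prior and process observations one at a time.
mean, precision = 0.0, 1e-6
stream = [2.1, 1.9, 2.4, 2.0, 1.6]
for y in stream:
    mean, precision = update_normal_mean(mean, precision, y)
    print(f"after y={y}: mean={mean:.3f}, sd={precision ** -0.5:.3f}")
```

Only the current mean and precision need to be stored between observations, which is the property that makes real-time fitting feasible; non-conjugate models require approximations such as the variational updates the abstract describes.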
Spatiotemporal Dynamics of Hantavirus Cardiopulmonary Syndrome Transmission Risk in Brazil
Background: Hantavirus disease in humans is rare but frequently lethal in the Neotropics. Several abundant and widely distributed Sigmodontinae rodents are the primary hosts of Orthohantavirus and, in combination with other factors, these rodents can shape hantavirus disease. Here, we assessed the influence of host diversity, climate, social vulnerability and land use change on the risk of hantavirus disease in Brazil over 24 years. Methods: Landscape variables (native forest, forestry, sugarcane, maize and pasture), climate (temperature and precipitation), and host biodiversity (derived through niche models) were used in spatiotemporal models, using the 5570 Brazilian municipalities as units of analysis. Results: Amounts of native forest and sugarcane, combined with temperature, were the most important factors influencing the increase of disease risk. Population at risk (rural workers) and rodent host diversity also had a positive effect on disease risk. Conclusions: Land use change—especially the conversion of native areas to sugarcane fields—can have a significant impact on hantavirus disease risk, likely by promoting the interaction between the people and the infected rodents. Our results demonstrate the importance of understanding the interactions between landscape change, rodent diversity, and hantavirus disease incidence, and suggest that land use policy should consider disease risk. Meanwhile, our risk map can be used to help allocate preventive measures to avoid disease.
Bayesian approach to estimate the biomass of anchovies off the coast of Perú
The Northern Humboldt Current System (NHCS) is the world's most productive ecosystem in terms of fish. In particular, the Peruvian anchovy (Engraulis ringens) is the major prey of the main top predators, like seabirds, fish, humans, and other mammals. In this context, it is important to understand the dynamics of the anchovy distribution to preserve it as well as to exploit its economic capacities. Using the data collected by the “Instituto del Mar del Perú” (IMARPE) during a scientific survey in 2005, we present a statistical analysis whose main goals are: (i) to adapt to the characteristics of the sampled data, such as spatial dependence, high proportions of zeros and large sample sizes; (ii) to provide important insights on the dynamics of the anchovy population; and (iii) to propose a model for estimation and prediction of anchovy biomass in the NHCS offshore from Perú. These data were analyzed in a Bayesian framework using the integrated nested Laplace approximation (INLA) method. Further, to select the best model and to study the predictive power of each model, we performed model comparisons and predictive checks, respectively. Finally, we carried out a Bayesian spatial influence diagnostic for the preferred model.