Search Results
128 results for "Bello, Kevin"
Discovering the effective connectome of the brain with dynamic Bayesian DAG learning
The complex mechanisms of the brain can be unraveled by extracting the Dynamic Effective Connectome (DEC). Recently, score-based Directed Acyclic Graph (DAG) discovery methods have shown significant improvements in extracting the causal structure and inferring effective connectivity. However, learning the DEC through these methods still faces two main challenges: one stemming from the fundamental limitations of high-dimensional dynamic DAG discovery methods, and the other from the low quality of fMRI data. In this paper, we introduce the Bayesian Dynamic DAG learning with M-matrices Acyclicity characterization (BDyMA) method to address these challenges in discovering the DEC. The presented dynamic DAG also enables us to discover direct feedback-loop edges. Leveraging an unconstrained framework, BDyMA yields more accurate results in detecting high-dimensional networks and sparser outcomes, making it particularly suitable for extracting the DEC. Additionally, the score function of BDyMA allows prior knowledge to be incorporated into the dynamic causal discovery process, which further enhances the accuracy of the results. Comprehensive simulations on synthetic data and experiments on Human Connectome Project (HCP) data demonstrate that our method can handle both challenges, yielding a more accurate and reliable DEC than state-of-the-art and traditional methods. We also investigate the trustworthiness of DTI data as prior knowledge for DEC discovery and show the improvements in DEC discovery when DTI data are incorporated into the process.
Highlights:
• We introduce BDyMA to discover the dynamic causal structure of high-dimensional networks.
• We demonstrate the effectiveness of BDyMA in comparison to existing methods.
• We show that our method enhances intrasubject and intersubject reliability.
• We examine the trustworthiness of DTI data as prior knowledge for DEC discovery.
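The abstract names two concrete ingredients: a dynamic linear SEM whose instantaneous graph must be acyclic while lagged edges capture feedback loops, and an M-matrix (log-det) acyclicity characterization. The snippet below is only a minimal sketch of the first ingredient, not the authors' Bayesian BDyMA implementation; the function name and the least-squares loss are illustrative choices (an acyclicity sketch appears after the DAGMA abstract further down).

```python
# Minimal sketch (not the BDyMA implementation) of a dynamic linear SEM:
# each time point depends on contemporaneous parents (W, constrained to be acyclic
# during optimization) and on the previous time point (A, unconstrained, so direct
# feedback loops show up as lagged edges).
import numpy as np

def dynamic_sem_loss(X, W, A):
    """Least-squares reconstruction loss for X_t ~ X_t @ W + X_{t-1} @ A."""
    X_t, X_prev = X[1:], X[:-1]          # X has shape (T, d): T time points, d regions
    resid = X_t - X_t @ W - X_prev @ A
    return 0.5 * np.mean(resid ** 2)
```

In a score-based method, this loss would be combined with a prior or regularizer and an acyclicity penalty applied to W only, leaving A free so that feedback can be expressed through time.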
Structured Prediction: Statistical and Computational Guarantees in Learning and Inference
Structured prediction consists of receiving a structured input and producing a combinatorial structure such as a tree, clustering, network, sequence, or permutation. From the computational viewpoint, structured prediction is in general considered intractable because the size of the output space is exponential in the input size. For instance, in image segmentation tasks, the number of admissible segments is exponential in the number of pixels. A second factor is the combination of high input dimensionality with limited data availability. In structured prediction it is common for the input to live in a high-dimensional space, which requires jointly reasoning about thousands or millions of variables while contending with a limited amount of data. Thus, learning and inference methods with strong computational and statistical guarantees are desired. The focus of our research is therefore to propose principled methods for structured prediction that are both polynomial time, i.e., computationally efficient, and require a polynomial number of data samples, i.e., statistically efficient. The main contributions of this thesis are as follows:
i. We develop an efficient and principled learning method of latent variable models for structured prediction under Gaussian perturbations. We derive a Rademacher-based generalization bound and argue that the use of non-convex formulations in learning latent-variable models leads to tighter bounds of the Gibbs decoder distortion.
ii. We study the fundamental limits of structured prediction, i.e., we characterize the necessary sample complexity for learning factor graph models in the context of structured prediction. In particular, we show that the finiteness of our novel pair-dimension is necessary for learning. Lastly, we show a connection between the pair-dimension and the VC-dimension, which allows existing results on the VC-dimension to be used to calculate the pair-dimension.
iii. We analyze a generative model based on connected graphs, and find the structural conditions of the graph that allow for the exact recovery of the node labels. In particular, we show that exact recovery is achievable in polynomial time for a large class of graphs. Our analysis is based on convex relaxations, where we thoroughly analyze a semidefinite program and a degree-4 sum-of-squares program. Finally, we extend this model to consider linear constraints (e.g., fairness), and formally explain the effect of the added constraints on the probability of exact recovery.
Improving Topic Coherence Using Entity Extraction Denoising
Managing large collections of documents is an important problem for many areas of science, industry, and culture. Probabilistic topic modeling offers a promising solution. Topic modeling is an unsupervised machine learning method, and the evaluation of such models is an interesting problem in its own right. Topic interpretability measures have been developed in recent years as a more natural option for topic quality evaluation, emulating human perception of coherence with word-set correlation scores. In this paper, we show experimental evidence that the topic coherence score improves when the training corpus is restricted to the relevant information in each document obtained by Entity Recognition. We experiment with job advertisement data and find that with this approach topic models improve interpretability by about 40 percentage points on average. Our analysis also reveals that, when the extracted text chunks are used, some redundant topics are merged while others are split into more topics. Fine-grained topics observed in models using the whole text are preserved.
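As a rough illustration of the denoising idea described above (fit a topic model only on entity-derived text and score its coherence), here is a minimal sketch; it is not the authors' pipeline, and the spaCy model, LDA settings, and c_v coherence are assumed choices.

```python
# Sketch: entity-restricted corpus -> LDA -> coherence score.
# Illustrative only; model and hyperparameter choices are not those of the paper.
import spacy
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

nlp = spacy.load("en_core_web_sm")

def entity_tokens(text):
    """Keep only tokens inside named-entity spans (the denoising step)."""
    doc = nlp(text)
    return [tok.lower_ for ent in doc.ents for tok in ent if tok.is_alpha]

def topic_coherence(texts, num_topics=20):
    tokenized = [entity_tokens(t) for t in texts]
    dictionary = Dictionary(tokenized)
    corpus = [dictionary.doc2bow(toks) for toks in tokenized]
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, random_state=0)
    return CoherenceModel(model=lda, texts=tokenized,
                          dictionary=dictionary, coherence="c_v").get_coherence()
```

Comparing `topic_coherence` on the entity-restricted corpus against the same pipeline run on full documents reproduces the kind of before/after comparison the abstract reports.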
Thermal energetics of bats of the family Vespertilionidae: an evolutionary approach
Thermal energetics define the way animals spend energy for thermoregulation. In this regard, numerous studies have determined that body mass (Mb) is the most influential morphological trait affecting thermal traits in different species of birds and mammals. However, most of these studies have focused on the basal metabolic rate (BMR), while other thermal traits have been less studied. We addressed this gap by examining thermal variables in bats of the family Vespertilionidae. Using open-flow respirometry, we measured BMR, absolute thermal conductance (C), lower and upper critical temperatures (TLC and TUC), and the breadth of the thermoneutral zone (TNZb) of 15 bat species from central Mexico varying in Mb from ~4.0 to 21.0 g. We: 1) combined our empirical data with information gathered from the literature and conducted phylogenetic analyses to investigate the relationship between Mb and thermal traits, and 2) mapped the thermal energetic values along the phylogeny to explore how they may have evolved. We found a positive relationship between Mb and both BMR and absolute C, and a negative relationship between Mb and both TLC and TUC. However, we did not find a relationship between Mb and TNZb in bats. The phylogenetic approach suggested that over the evolutionary history of bats, BMR and C have decreased while TLC and TUC have increased. Our results suggest that adaptive changes in Mb and thermal traits may have influenced the geographical distribution and the use of energy-saving strategies of the different species of bats of the family Vespertilionidae. Competing Interest Statement: The authors have declared no competing interest.
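For readers unfamiliar with how the Mb-trait relationships above are usually quantified, the allometric convention is a power law, trait = a * Mb^b, fit on log-log scales. The sketch below is a plain least-squares fit, not the phylogenetically corrected analyses the study actually uses, and the function name is illustrative.

```python
# Plain log-log allometric fit (NOT the phylogenetic analysis used in the study).
import numpy as np

def allometric_fit(mb_g, trait):
    """Fit trait = a * Mb**b by least squares on log10 scales; returns (a, b)."""
    b, log_a = np.polyfit(np.log10(mb_g), np.log10(trait), 1)
    return 10 ** log_a, b
```

A positive exponent b corresponds to the positive Mb-BMR and Mb-C relationships reported above, and a negative b to the Mb-TLC and Mb-TUC relationships.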
Fairness constraints can help exact inference in structured prediction
Many inference problems in structured prediction can be modeled as maximizing a score function on a space of labels, where graphs are a natural representation to decompose the total score into a sum of unary (node) and pairwise (edge) scores. Given a generative model with an undirected connected graph \(G\) and a true vector of binary labels, it has been previously shown that when \(G\) has good expansion properties, such as complete graphs or \(d\)-regular expanders, one can exactly recover the true labels (with high probability and in polynomial time) from a single noisy observation of each edge and node. We analyze the generative model previously studied by Globerson et al. (2015) under a notion of statistical parity. That is, given a fair binary node labeling, we ask whether it is possible to recover the fair assignment, with high probability and in polynomial time, from single edge and node observations. We find that, in contrast to the known trade-offs between fairness and model performance, the addition of the fairness constraint improves the probability of exact recovery. We effectively explain this phenomenon and empirically show how graphs with poor expansion properties, such as grids, are now capable of achieving exact recovery with high probability. Finally, as a byproduct of our analysis, we provide a tighter minimum-eigenvalue bound than that of Weyl's inequality.
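To make the objective above concrete, the sketch below brute-forces the labeling that maximizes the sum of pairwise (edge) and unary (node) scores over binary labelings, optionally restricted to balanced labelings as a toy stand-in for the statistical-parity constraint. It is usable only on tiny graphs and does not implement the paper's polynomial-time convex (SDP / sum-of-squares) relaxations; function and argument names are illustrative.

```python
# Brute-force MAP labeling for a score that decomposes into edge and node terms.
# Toy illustration only; the paper's results rely on convex relaxations instead.
from itertools import product

def map_labeling(n, edge_scores, node_scores, balanced=False):
    """edge_scores: {(i, j): c_ij}; node_scores: {i: c_i}; labels are in {-1, +1}."""
    best_y, best_val = None, float("-inf")
    for y in product((-1, 1), repeat=n):
        if balanced and sum(y) != 0:      # crude statistical-parity-style constraint
            continue
        val = sum(c * y[i] * y[j] for (i, j), c in edge_scores.items())
        val += sum(c * y[i] for i, c in node_scores.items())
        if val > best_val:
            best_y, best_val = y, val
    return best_y, best_val
```

Running it with and without `balanced=True` on small noisy grids gives a hands-on feel for the abstract's claim that the fairness constraint can help rather than hurt exact recovery.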
Torpor energetics are related to the interaction between body mass and climate in bats of the family Vespertilionidae
Torpor is an adaptive strategy that allows animals to cope with energy limitations under adverse environmental conditions. In birds and mammals, intrinsic and extrinsic factors such as body mass (Mb) and ambient temperature (Ta) are well-established triggers of torpor. Interestingly, how the interplay between Mb and climate, with their different Ta, affects torpor traits in bats remains unexplored. Using open-flow respirometry, we calculated the Ta upon entering torpor (Tat), the reduction in torpid metabolic rate relative to the basal metabolic rate (TMRred), the Ta at which torpid metabolic rate reached its minimum (Ta adjust), and the minimum torpid metabolic rate (TMRmin) in 11 bat species of the family Vespertilionidae from warm and cold climates that differ in Mb. We also included TMRmin data retrieved through a literature review. We tested the effects of Mb and climate on torpor traits using mixed-effect phylogenetic models. All models showed a significant interaction between Mb and climate. This interaction was inversely related to Tat, TMRred, and Ta adjust, and positively related to TMRmin. These results are likely explained by the differences in Mb and metabolic rate of bats from different climates, which may allow individuals to express torpor in places with different Ta. Further studies to assess torpor use in bats of different climates are proposed. The interaction between body mass and climate influences torpor energetics in bats of the family Vespertilionidae; as a result, torpid traits change with body mass and climate.
\(\text{C}^2\text{P}\): Featuring Large Language Models with Causal Reasoning
Causal reasoning is one of the primary bottlenecks that Large Language Models (LLMs) must overcome to attain human-level intelligence. Recent studies indicate that LLMs display near-random performance on reasoning tasks. To address this, we introduce the Causal Chain of Prompting (\(\text{C}^2\text{P}\)), a reasoning framework that aims to equip current LLMs with causal reasoning capabilities; it is the first framework of its kind to operate autonomously, without relying on external tools or modules during either the causal learning or the reasoning phase. To evaluate the performance of \(\text{C}^2\text{P}\), we first demonstrate that reasoning accuracy improved by over \(30.7\%\) and \(25.9\%\) for GPT-4 Turbo and LLaMA 3.1, respectively, when using our framework, compared to the same models without \(\text{C}^2\text{P}\) on a synthetic benchmark dataset. Then, with few-shot learning of the same LLMs using \(\text{C}^2\text{P}\), reasoning accuracy increased by more than \(20.05\%\) and \(20.89\%\), respectively, with as few as ten examples, compared to the corresponding LLMs without \(\text{C}^2\text{P}\) on the same dataset. To evaluate \(\text{C}^2\text{P}\) in realistic scenarios, we used another benchmark dataset containing natural stories across various fields, including healthcare, medicine, economics, education, social sciences, environmental science, and marketing. The results show improved reasoning when \(\text{C}^2\text{P}\) is applied, compared to cases where our framework is not used, which often lead to random and hallucinated responses. By showing the improved performance of few-shot learned GPT-4 Turbo and LLaMA 3.1 with \(\text{C}^2\text{P}\), we demonstrate the generalizability of our framework.
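The abstract describes \(\text{C}^2\text{P}\) only at a high level, so the sketch below is just a generic staged-prompting loop in that spirit, not the authors' prompts or framework; `call_llm` is an assumed user-supplied callable (a thin wrapper around whatever chat model is available) and the prompt wording is invented for illustration.

```python
# Generic "chain of causal prompts" sketch; NOT the C2P framework or its prompts.
# `call_llm(prompt) -> str` is an assumed wrapper around an LLM chat API.
def causal_prompt_chain(story: str, question: str, call_llm) -> str:
    variables = call_llm(
        f"List the variables (events or quantities) mentioned in this story:\n{story}")
    relations = call_llm(
        "For the variables below, state which variables plausibly cause which, "
        f"one 'X -> Y' pair per line.\n\nStory:\n{story}\n\nVariables:\n{variables}")
    return call_llm(
        "Using only the causal relations below, answer the question and justify briefly."
        f"\n\nRelations:\n{relations}\n\nQuestion: {question}")
```

The point of structuring the calls this way is that the model commits to an explicit set of causal relations before answering, which is the general intuition behind chaining causal prompts.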
Markov Equivalence and Consistency in Differentiable Structure Learning
Existing approaches to differentiable structure learning of directed acyclic graphs (DAGs) rely on strong identifiability assumptions in order to guarantee that global minimizers of the acyclicity-constrained optimization problem identify the true DAG. Moreover, it has been observed empirically that the optimizer may exploit undesirable artifacts in the loss function. We explain and remedy these issues by studying the behavior of differentiable acyclicity-constrained programs under general likelihoods with multiple global minimizers. By carefully regularizing the likelihood, it is possible to identify the sparsest model in the Markov equivalence class, even in the absence of an identifiable parametrization. We first study the Gaussian case in detail, showing how proper regularization of the likelihood defines a score that identifies the sparsest model; assuming faithfulness, it also recovers the Markov equivalence class. These results are then generalized to general models and likelihoods, where the same claims hold. These theoretical results are validated empirically, showing how this can be done with standard gradient-based optimizers, thus paving the way for differentiable structure learning under general models and losses.
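For context on what "carefully regularizing the likelihood" can look like in the Gaussian case, the sketch below computes a profiled Gaussian negative log-likelihood for a linear SEM plus an L1 penalty; the L1 choice is an assumption for illustration and is not necessarily the regularizer analyzed in the paper.

```python
# Regularized Gaussian score for a linear SEM X = X @ W + noise (sketch).
# Per-node noise variances are profiled out; lambda_ * ||W||_1 is an illustrative
# sparsity regularizer, not necessarily the one the paper studies.
import numpy as np

def regularized_gaussian_score(X, W, lambda_=0.1):
    n, d = X.shape
    resid = X - X @ W
    rss = (resid ** 2).sum(axis=0)                 # residual sum of squares per node
    neg_loglik = 0.5 * n * np.log(rss / n).sum()   # profiled Gaussian NLL (up to constants)
    return neg_loglik + lambda_ * np.abs(W).sum()
```

Minimizing such a score over acyclic W is exactly the kind of program whose global minimizers the abstract characterizes.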
Identifying General Mechanism Shifts in Linear Causal Representations
We consider the linear causal representation learning setting where we observe a linear mixing of \(d\) unknown latent factors, which follow a linear structural causal model. Recent work has shown that it is possible to recover the latent factors, as well as the underlying structural causal model over them, up to permutation and scaling, provided that we have at least \(d\) environments, each corresponding to perfect interventions on a single latent node (factor). Following this powerful result, a key open problem for the community has been to relax these conditions: allow for coarser than perfect single-node interventions, and allow for fewer than \(d\) of them, since the number of latent factors \(d\) could be very large. In this work, we consider precisely such a setting, where we allow fewer than \(d\) environments and also allow for very coarse interventions that can change the entire causal graph over the latent factors. On the flip side, we relax what we wish to extract to simply the list of nodes that have shifted between one or more environments. We provide a surprising identifiability result: it is indeed possible, under some very mild standard assumptions, to identify the set of shifted nodes. Moreover, our identifiability proof is constructive: we explicitly provide necessary and sufficient conditions for a node to be a shifted node, and show that we can check these conditions given observed data. Our algorithm lends itself very naturally to the sample setting where, instead of just interventional distributions, we are provided datasets of samples from each of these distributions. We corroborate our results on both synthetic experiments and an interesting psychometric dataset. The code can be found at https://github.com/TianyuCodings/iLCS.
DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization
The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of M-matrices, which is a key difference from the classical log-det function defined over the cone of positive definite matrices. Like previously proposed acyclicity functions, our characterization is exact and differentiable. However, compared to existing characterizations, our log-det function: (1) is better at detecting large cycles; (2) has better-behaved gradients; and (3) runs in practice about an order of magnitude faster. On the optimization side, we drop the typically used augmented Lagrangian scheme and propose DAGMA (DAGs via M-matrices for Acyclicity), a method that resembles the central path of barrier methods. Each point in the central path of DAGMA is a solution to an unconstrained problem regularized by our log-det function, and we show that at the limit of the central path the solution is guaranteed to be a DAG. Finally, we provide extensive experiments for linear and nonlinear SEMs and show that our approach achieves large speed-ups and smaller structural Hamming distances compared to state-of-the-art methods. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/dagma.
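The log-det characterization named above is simple to compute; a minimal NumPy sketch (the official implementation and its gradient handling live in the linked repository) is:

```python
# DAGMA-style log-det acyclicity value:
#   h_s(W) = -log det(s*I - W*W) + d*log(s),   (W*W elementwise)
# which is >= 0 on its domain and equals 0 exactly when W is the weighted
# adjacency matrix of a DAG. See https://github.com/kevinsbello/dagma for the full method.
import numpy as np

def h_logdet(W, s=1.0):
    d = W.shape[0]
    M = s * np.eye(d) - W * W                      # must be an M-matrix (pick s large enough)
    sign, logabsdet = np.linalg.slogdet(M)
    assert sign > 0, "W lies outside the domain of the log-det characterization"
    return -logabsdet + d * np.log(s)
```

For an acyclic W (say, strictly upper triangular), W*W is nilpotent, det(M) = s^d, and h_logdet returns 0; within the domain, any cycle lowers the determinant and makes the value strictly positive.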