Search Results

65,813 result(s) for "Decision Processes"
Markov chains and decision processes for engineers and managers
"This book presents an introduction to finite Markov chains and Markov decision processes, with applications in engineering and management. It introduces discrete-time, finite-state Markov chains and Markov decision processes. The text describes both algorithms and applications, enabling students to understand the logical basis for the algorithms and to apply them. The applications address problems in government, business, and nonprofit sectors. The author uses Markov models to approximate the random behavior of complex systems in diverse areas, such as management, production, science, education, health services, finance, and marketing" -- Provided by publisher.
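As a hedged illustration of the kind of algorithm the book covers, here is a minimal value iteration on a toy discrete-time, finite-state MDP. All transition and reward numbers are invented for illustration; they are not taken from the book.

```python
import numpy as np

P = np.array([                  # P[a, s, s'] = transition probability
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.0, 1.0]],   # action 1
])
R = np.array([                  # R[a, s] = expected immediate reward
    [5.0, -1.0],
    [10.0, 2.0],
])
gamma = 0.9                     # discount factor

V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * (P @ V)     # Q[a, s]: one-step lookahead value
    V_new = Q.max(axis=0)       # greedy Bellman backup over actions
    if np.abs(V_new - V).max() < 1e-8:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=0)       # optimal action in each state
```

Because the Bellman backup is a gamma-contraction, the loop converges to the optimal value function regardless of the starting guess.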
Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning
We consider scenarios where a swarm of unmanned vehicles (UxVs) seeks to satisfy a number of diverse, spatially distributed objectives. The UxVs strive to determine an efficient plan to service the objectives while operating in a coordinated fashion. We focus on developing autonomous high-level planning, where low-level controls are leveraged from previous work in distributed motion, target tracking, localization, and communication. We rely on the use of state and action abstractions in a Markov decision process framework to introduce a hierarchical algorithm, Dynamic Domain Reduction for Multi-Agent Planning, that enables multi-agent planning for large multi-objective environments. Our analysis establishes the correctness of our search procedure within specific subsets of the environments, termed ‘sub-environments’, and characterizes the algorithm's performance with respect to the optimal trajectories in single-agent and sequential multi-agent deployment scenarios using tools from submodularity. Simulation results show significant improvement over using a standard Monte Carlo tree search in an environment with large state and action spaces.
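The baseline the paper compares against, a standard Monte Carlo tree search, hinges on the UCT selection rule at each tree node. Here is a minimal, self-contained sketch of that rule; the node layout and visit counts are invented for illustration.

```python
import math

def uct_select(children, c=1.4):
    """Pick the child maximising value/visits + c * sqrt(ln(N) / visits)."""
    N = sum(n["visits"] for n in children)
    def score(n):
        if n["visits"] == 0:
            return float("inf")   # always try unvisited children first
        return n["value"] / n["visits"] + c * math.sqrt(math.log(N) / n["visits"])
    return max(children, key=score)

children = [
    {"action": "a", "visits": 10, "value": 7.0},
    {"action": "b", "visits": 2, "value": 1.8},
    {"action": "c", "visits": 0, "value": 0.0},
]
best = uct_select(children)   # the unvisited child "c" wins
```

The exploration constant `c` trades off exploiting high average values against exploring rarely visited actions; 1.4 (roughly sqrt(2)) is a conventional default, not a value from the paper.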
Explicit Explore, Exploit, or Escape (E4): near-optimal safety-constrained reinforcement learning in polynomial time
In reinforcement learning (RL), an agent must explore an initially unknown environment in order to learn a desired behaviour. When RL agents are deployed in real-world environments, safety is of primary concern. Constrained Markov decision processes (CMDPs) can provide long-term safety constraints; however, the agent may violate the constraints in an effort to explore its environment. This paper proposes a model-based RL algorithm called Explicit Explore, Exploit, or Escape (E4), which extends the Explicit Explore or Exploit (E3) algorithm to a robust CMDP setting. E4 explicitly separates exploitation, exploration, and escape CMDPs, allowing targeted policies for policy improvement across known states, discovery of unknown states, as well as safe return to known states. E4 robustly optimises these policies on the worst-case CMDP from a set of CMDP models consistent with the empirical observations of the deployment environment. Theoretical results show that E4 finds a near-optimal constraint-satisfying policy in polynomial time whilst satisfying safety constraints throughout the learning process. We then discuss E4 as a practical algorithmic framework, including robust-constrained offline optimisation algorithms, the design of uncertainty sets for the transition dynamics of unknown states, and how to further leverage empirical observations and prior knowledge to relax some of the worst-case assumptions underlying the theory.
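E4 itself is not reproduced here; as a hedged sketch of the CMDP bookkeeping that safety-constrained methods of this kind build on, one can exactly evaluate a fixed policy against both a reward signal and a constraint-cost signal, then check the cost against a budget d. The 2-state model, the candidate policy, and the budget are all invented for illustration.

```python
import numpy as np

P = np.array([                  # P[a, s, s'] = transition probability
    [[0.8, 0.2], [0.3, 0.7]],   # action 0
    [[0.5, 0.5], [0.1, 0.9]],   # action 1
])
r = np.array([[1.0, 0.0],       # reward r[a, s]
              [2.0, 1.0]])
c = np.array([[0.0, 1.0],       # constraint cost c[a, s]
              [0.5, 0.2]])
gamma, d = 0.9, 2.0             # discount factor and safety budget

def evaluate(pi, signal):
    """Exact discounted evaluation of deterministic policy pi for a scalar signal."""
    nS = P.shape[1]
    P_pi = np.array([P[pi[s], s] for s in range(nS)])       # policy-induced chain
    g_pi = np.array([signal[pi[s], s] for s in range(nS)])  # policy-induced signal
    return np.linalg.solve(np.eye(nS) - gamma * P_pi, g_pi)

pi = np.array([0, 1])                # candidate deterministic policy
V_r = evaluate(pi, r)                # expected discounted reward per start state
V_c = evaluate(pi, c)                # expected discounted constraint cost
feasible = bool(np.all(V_c <= d))    # the policy is safe iff cost stays within budget
```

A CMDP solver searches over policies for the best feasible one; this fragment only shows the feasibility check for a single given policy.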
Decision-Making Preferences in Times of Crisis
During crises, understanding political decision-making processes and evaluating related preferences are key to the legitimacy of political decisions. Our research focuses on preferences in decision-making processes in times of crisis through the analysis of the representational style most preferred by voters: that is, whether they prefer representation of the public good by the representatives, adherence to party lines, the involvement of experts, or the incorporation of voters' interests. Within the framework of representative democracy, these decisions are mediated by representatives whose representational style determines whose interests and opinions decision-making processes integrate. In our analysis, we examined representational styles in the context of three different types of crises: economic, social, and environmental. Our results indicate that the type of crisis makes no difference to preferred political decision-making processes, as Hungarian voters tend to favor processes in which they are consulted by the representatives across different scenarios. Representatives' commitment to party lines is disfavored in political decision-making, and we observed no clear preference regarding the involvement of experts in political decisions in times of crisis. These observed preferences strongly contradict the prevailing "strong party discipline" in Hungary. This deviation accentuates both weakening representative linkages and the importance of the performative elements of representation, feeding into the populist characteristic of Hungarian democracy.
Decision-making under uncertainty: beyond probabilities
This position paper reflects on the state of the art in decision-making under uncertainty. A classical assumption is that probabilities can sufficiently capture all uncertainty in a system. In this paper, the focus is on the uncertainty that goes beyond this classical interpretation, particularly by employing a clear distinction between aleatoric and epistemic uncertainty. The paper features an overview of Markov decision processes (MDPs) and extensions to account for partial observability and adversarial behavior. These models sufficiently capture aleatoric uncertainty but fail to account for epistemic uncertainty robustly. Consequently, we present a thorough overview of so-called uncertainty models that represent uncertainty in a more robust way. We show several solution techniques for both discrete and continuous models, ranging from formal verification and control-based abstractions to reinforcement learning. As an integral part of this paper, we list and discuss several key challenges that arise when dealing with rich types of uncertainty in a model-based fashion.
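One family of robust uncertainty models of the kind the paper surveys is the interval MDP, where transition probabilities are only known to lie in intervals. The sketch below runs robust value iteration against a worst-case adversary choosing within those intervals; the interval bounds and rewards are invented for illustration.

```python
import numpy as np

P_lo = np.array([                 # lower transition bounds, P_lo[a, s, s']
    [[0.6, 0.1], [0.1, 0.7]],
    [[0.3, 0.3], [0.0, 0.8]],
])
P_hi = np.array([                 # upper transition bounds, P_hi[a, s, s']
    [[0.9, 0.4], [0.3, 0.9]],
    [[0.7, 0.7], [0.2, 1.0]],
])
R = np.array([[1.0, 0.0],         # reward R[a, s]
              [2.0, 0.5]])
gamma = 0.9

def worst_case_expectation(lo, hi, v):
    """Minimise p @ v over distributions p with lo <= p <= hi and sum(p) = 1,
    by assigning the free probability mass to the cheapest successors first."""
    p = lo.copy()
    budget = 1.0 - p.sum()
    for i in np.argsort(v):       # cheapest successors first
        add = min(hi[i] - lo[i], budget)
        p[i] += add
        budget -= add
    return p @ v

V = np.zeros(2)
for _ in range(500):
    Q = np.array([[R[a, s] + gamma * worst_case_expectation(P_lo[a, s], P_hi[a, s], V)
                   for s in range(2)] for a in range(2)])
    V_new = Q.max(axis=0)         # agent maximises against the adversarial choice
    if np.abs(V_new - V).max() < 1e-8:
        V = V_new
        break
    V = V_new
```

The inner minimisation has this simple greedy solution only because the uncertainty set is a box of intervals; richer uncertainty sets generally need convex optimisation.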
Analysis of a time–cost trade-off in a resource-constrained GERT project scheduling problem using the Markov decision process
The advent of new types of projects, such as startups, maintenance, and education, is revolutionizing project management; classical project scheduling methods are incapable of analyzing these stochastic projects. This study considers a time–cost trade-off project scheduling problem where the structure of the project is uncertain. To deal with the uncertainties, we implemented the Graphical Evaluation and Review Technique (GERT). The main aim of the study is to balance time and the amount of a non-renewable resource allocated to each activity, considering the finite time horizon and resource limitations. To preserve the generality of the model, we considered both discrete and continuous distribution functions for activity durations. From a methodological standpoint, we propose an analytical approach based on the Markov Decision Process (MDP) and Semi-Markov Decision Process (SMDP) to find the probability distribution of the project makespan. These models are solved using value iteration and a finite-horizon Linear Programming (LP) model. Two randomly generated examples explain value iteration for the models in detail. Furthermore, seven example groups, each with five instances, are adopted from a well-known data set, PSPLIB, to validate the efficiency of the proposed models against two extensively studied methods, the Genetic Algorithm (GA) and Monte Carlo simulation. The convergence of the GA and simulation results to those of the MDP and SMDP demonstrates the efficiency of the proposed models. In addition, a sensitivity analysis of the project completion probability with respect to the available resource gives managers good insight for planning their resources.
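Finite-horizon MDPs of the kind used here are solved by value iteration via backward induction. As a hedged sketch (random illustrative transitions and stage costs, not the paper's GERT network), the backward recursion looks like:

```python
import numpy as np

T = 5              # finite planning horizon
nS, nA = 3, 2      # toy state and action counts
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(nS), size=(nA, nS))   # P[a, s] is a distribution over s'
C = rng.uniform(0.0, 1.0, size=(nA, nS))        # stage cost C[a, s]

V = np.zeros((T + 1, nS))         # terminal condition: no cost beyond the horizon
pi = np.zeros((T, nS), dtype=int)
for t in range(T - 1, -1, -1):    # backward induction from the horizon
    Q = C + P @ V[t + 1]          # Q[a, s]: stage cost plus expected cost-to-go
    V[t] = Q.min(axis=0)          # minimise expected total cost
    pi[t] = Q.argmin(axis=0)      # time-dependent optimal policy
```

Unlike the infinite-horizon case, the optimal policy here is time-dependent: `pi[t]` can differ at each stage as the remaining horizon shrinks.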
Recent Advances in Deep Reinforcement Learning Applications for Solving Partially Observable Markov Decision Processes (POMDP) Problems: Part 1—Fundamentals and Applications in Games, Robotics and Natural Language Processing
The first part of this two-part series of papers provides a survey of recent advances in Deep Reinforcement Learning (DRL) applications for solving partially observable Markov decision process (POMDP) problems. Reinforcement Learning (RL) is an approach to simulating a human's natural learning process, the key of which is to let the agent learn by interacting with a stochastic environment. The fact that the agent has only limited access to information about the environment enables AI to be applied efficiently in most fields that require self-learning. Although efficient algorithms are widely used, an organized investigation seems essential so that we can make good comparisons and choose the best structures or algorithms when applying DRL in various applications. In this overview, we introduce Markov Decision Process (MDP) problems and Reinforcement Learning, along with applications of DRL for solving POMDP problems in games, robotics, and natural language processing. A follow-up paper will cover applications in transportation, communications and networking, and industries.
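At the heart of a POMDP is the Bayesian belief update that DRL agents must, implicitly or explicitly, approximate: after taking an action and receiving an observation, the belief over hidden states is predicted forward through the dynamics and re-weighted by the observation likelihood. A minimal sketch on a tiger-style toy model (transition and observation numbers invented for illustration):

```python
import numpy as np

P = np.array([[[1.0, 0.0],
               [0.0, 1.0]]])      # one "listen" action: hidden state unchanged
O = np.array([[[0.85, 0.15],
               [0.15, 0.85]]])    # O[a, s', o]: 85%-accurate noisy observation

def belief_update(b, a, o):
    """b'(s') ∝ O[a, s', o] * Σ_s P[a, s, s'] b(s): predict, then correct."""
    predicted = b @ P[a]              # push the belief through the dynamics
    b_new = O[a][:, o] * predicted    # re-weight by observation likelihood
    return b_new / b_new.sum()        # renormalise to a distribution

b = belief_update(np.array([0.5, 0.5]), a=0, o=0)   # → array([0.85, 0.15])
```

Starting from total ignorance, one accurate observation moves the belief to match the sensor's 85% accuracy; the belief state, not the hidden state, is what a POMDP policy conditions on.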