292 results for "backwards induction"
Every Choice Function Is Backwards-Induction Rationalizable
A choice function is backwards-induction rationalizable if there exists a finite perfect-information extensive-form game such that for each subset of alternatives, the backwards-induction outcome of the restriction of the game to that subset of alternatives coincides with the choice from that subset. We prove that every choice function is backwards-induction rationalizable.
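The recursion behind this result is easy to make concrete: the sketch below computes the backward-induction outcome of a small finite perfect-information game. The tree encoding and the payoff numbers are illustrative assumptions, not taken from the paper.

```python
def backward_induct(node):
    """Return the backward-induction payoff profile at `node`.

    `node` is ("T", payoffs) for a terminal node, or
    ("D", player, children) for a decision node of `player` (0 or 1).
    """
    if node[0] == "T":
        return node[1]
    _, player, children = node
    # The mover picks the child whose induced outcome maximizes
    # her own payoff component.
    return max((backward_induct(c) for c in children),
               key=lambda payoffs: payoffs[player])

# Two-stage example: player 0 moves first, player 1 responds.
game = ("D", 0, [
    ("D", 1, [("T", (3, 1)), ("T", (0, 2))]),  # after 0's first move
    ("D", 1, [("T", (2, 2)), ("T", (1, 3))]),  # after 0's second move
])
print(backward_induct(game))  # → (1, 3)
```

Restricting `game` to a subset of terminal alternatives and re-running the recursion gives the restriction the abstract refers to.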
A Backwards Induction Framework for Quantifying the Option Value of Smart Charging of Electric Vehicles and the Risk of Stranded Assets under Uncertainty
The anticipated electrification of the transport sector may lead to a significant increase in future peak electricity demand, resulting in potential violations of network constraints. As a result, considerable network reinforcement may be required to ensure that the expected additional demand from newly connected electric vehicles (EVs) is safely accommodated. In this paper we present the Backwards Induction Framework (BIF), which we use to identify optimal investment decisions and to calculate the option value of smart charging of EVs and the cost of stranded assets; these concepts are crystallized through illustrative case studies. Sensitivity analyses show how the option value of smart charging and the optimal solution are affected by key factors such as the social cost of not accommodating the full EV capacity, the flexibility of smart charging, and the scenario probabilities. Moreover, the BIF is compared with the Stochastic Optimization Framework and key insights are drawn.
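Backward induction over an investment scenario tree of this kind can be sketched in a few lines: decide at stage 2 (after uptake is revealed) first, then fold the expected stage-2 cost back into the stage-1 choice. All costs, probabilities, and action names below are illustrative assumptions, not figures from the paper.

```python
P_HIGH = 0.25                  # assumed probability of high EV uptake
REINFORCE_NOW = 100.0          # immediate network reinforcement cost
REINFORCE_LATER = 120.0        # deferred reinforcement is costlier
SMART_CHARGING = 50.0          # cost of managing demand via smart charging

def stage2_cost(high_uptake, smart_available):
    """Cheapest stage-2 action once uptake is revealed (the backward step)."""
    if not high_uptake:
        return 0.0             # no constraint violation, nothing to do
    options = [REINFORCE_LATER] + ([SMART_CHARGING] if smart_available else [])
    return min(options)

def stage1_cost(smart_available):
    """Stage 1: reinforce now, or wait and act on the revealed scenario."""
    wait = (P_HIGH * stage2_cost(True, smart_available)
            + (1 - P_HIGH) * stage2_cost(False, smart_available))
    return min(REINFORCE_NOW, wait)

# Option value of smart charging: expected-cost saving from having the
# smart-charging action available at stage 2.
option_value = stage1_cost(smart_available=False) - stage1_cost(smart_available=True)
print(option_value)  # 17.5 under these illustrative numbers
```

Note how reinforcing now would never be chosen at these numbers; raising `P_HIGH` or `REINFORCE_LATER` flips the stage-1 decision, which is the kind of sensitivity the abstract discusses.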
Tree-Based Reinforcement Learning for Estimating Optimal Dynamic Treatment Regimes
Dynamic treatment regimes (DTRs) are sequences of treatment decision rules, in which treatment may be adapted over time in response to the changing course of an individual. Motivated by the substance use disorder (SUD) study, we propose a tree-based reinforcement learning (T-RL) method to directly estimate optimal DTRs in a multi-stage multi-treatment setting. At each stage, T-RL builds an unsupervised decision tree that directly handles the problem of optimization with multiple treatment comparisons, through a purity measure constructed with augmented inverse probability weighted estimators. For the multiple stages, the algorithm is implemented recursively using backward induction. By combining semiparametric regression with flexible tree-based learning, T-RL is robust, efficient and easy to interpret for the identification of optimal DTRs, as shown in the simulation studies. With the proposed method, we identify dynamic SUD treatment regimes for adolescents.
Doubly-robust dynamic treatment regimen estimation via weighted least squares
Personalized medicine is a rapidly expanding area of health research wherein patient level information is used to inform their treatment. Dynamic treatment regimens (DTRs) are a means of formalizing the sequence of treatment decisions that characterize personalized management plans. Identifying the DTR which optimizes expected patient outcome is of obvious interest and numerous methods have been proposed for this purpose. We present a new approach which builds on two established methods: Q-learning and G-estimation, offering the doubly robust property of the latter but with ease of implementation much more akin to the former. We outline the underlying theory, provide simulation studies that demonstrate the double-robustness and efficiency properties of our approach, and illustrate its use on data from the Promotion of Breastfeeding Intervention Trial.
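The Q-learning half of this comparison is itself a backward recursion over stages: fit the last-stage Q-function, form a pseudo-outcome from its maximum over treatments, and regress that on earlier history. A minimal two-stage sketch on simulated data follows; the data-generating model and the linear Q-function forms are assumptions for illustration, not the paper's estimator.

```python
# Q-learning for a two-stage dynamic treatment regime, via backward
# induction: stage 2 is fit first, then its optimized value becomes the
# response for the stage-1 regression.
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)                  # stage-1 covariate
a1 = rng.integers(0, 2, size=n)          # stage-1 treatment (0/1)
x2 = x1 + rng.normal(size=n)             # stage-2 covariate
a2 = rng.integers(0, 2, size=n)          # stage-2 treatment (0/1)
# Outcome: treating at stage 2 helps when x2 < 1, hurts otherwise.
y = x2 + a2 * (1.0 - x2) + 0.5 * a1 + rng.normal(size=n)

def fit_ols(design, response):
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

# Stage 2: Q2(x2, a2) modeled as linear in (1, x2, a2, a2 * x2).
X2 = np.column_stack([np.ones(n), x2, a2, a2 * x2])
b2 = fit_ols(X2, y)

def q2(x, a):
    return b2[0] + b2[1] * x + b2[2] * a + b2[3] * a * x

# Backward step: pseudo-outcome is the max of Q2 over stage-2 treatments.
v2 = np.maximum(q2(x2, 0), q2(x2, 1))

# Stage 1: regress the pseudo-outcome on stage-1 history.
X1 = np.column_stack([np.ones(n), x1, a1, a1 * x1])
b1 = fit_ols(X1, v2)
# Estimated stage-2 rule: treat when b2[2] + b2[3] * x > 0
# (the true rule here is: treat when x < 1).
```

G-estimation replaces these outcome regressions with estimating equations that remain valid if either the outcome model or the treatment model is correct, which is the doubly robust property the abstract imports.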
Constructing dynamic treatment regimes over indefinite time horizons
Existing methods for estimating optimal dynamic treatment regimes are limited to cases where a utility function is optimized over a fixed time period. We develop an estimation procedure for the optimal dynamic treatment regime over an indefinite time period and derive associated large-sample results. The proposed method can be used to estimate the optimal dynamic treatment regime in chronic disease settings. We illustrate this by simulating a dataset corresponding to a cohort of patients with diabetes that mimics the third wave of the National Health and Nutrition Examination Survey, and examining the performance of the proposed method in controlling the level of haemoglobin A1c.
Checkmate: Exploring Backward Induction among Chess Players
Although backward induction is a cornerstone of game theory, most laboratory experiments have found that agents are not able to successfully backward induct. We analyze the play of world-class chess players in the centipede game, which is ill-suited for testing backward induction, and in pure backward induction games—Race to 100 games. We find that chess players almost never play the backward induction equilibrium in the centipede game, but many properly backward induct in the Race to 100 games. We find no systematic within-subject relationship between choices in the centipede game and performance in pure backward induction games. (JEL C73)
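The Race to 100 games mentioned above have a clean backward-induction solution that exhaustive recursion can verify. In the standard variant assumed here (players alternately add 1 to 10 to a running total; whoever reaches 100 wins), the losing totals are exactly those congruent to 1 mod 11, so the first mover wins by moving to 1 and then mirroring:

```python
from functools import lru_cache

TARGET, MAX_ADD = 100, 10

@lru_cache(maxsize=None)
def wins(total):
    """True if the player to move from `total` can force reaching TARGET."""
    return any(total + k == TARGET or not wins(total + k)
               for k in range(1, MAX_ADD + 1)
               if total + k <= TARGET)

# Backward induction from 100: a total is losing iff every move hands the
# opponent a winning total.
losing = [t for t in range(TARGET) if not wins(t)]
print(losing)  # [1, 12, 23, 34, 45, 56, 67, 78, 89]
```

This is the whole equilibrium strategy the chess players had to discover, which is why the Race games isolate pure backward induction better than the centipede game.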
Level-k thinking in the extensive form
Level-k thinking has been widely applied as a solution concept for games in normal form in behavioral and experimental game theory. We consider level-k thinking in games in extensive form. Players may learn about the levels of opponents' thinking during the play of the game because some information sets may be inconsistent with certain levels. In particular, for any information set reached, a level-k player attaches the maximum level-ℓ thinking, for ℓ < k, to her opponents consistent with the information set. We compare our notion of strong level-k thinking with other solution concepts such as level-k thinking in the associated normal form, strong rationalizability, Δ-rationalizability, iterated admissibility, backward rationalizability, backward level-k thinking, and backward induction. We use strong level-k thinking to reanalyze data from some prior experiments in the literature.
Vulnerability and defence: A case for Stackelberg game dynamics
This paper examines the tactical interaction between drones and tanks in modern warfare through game theory, particularly focusing on Stackelberg equilibrium and backward induction. It describes a high-stakes conflict between two teams: one using advanced drones for attack, and the other defending using tanks. The paper conceptualizes this as a sequential game, illustrating the complex strategic dynamics similar to Stackelberg competition, where moves and countermoves are carefully analyzed and predicted.
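With finite action sets, the backward induction of a Stackelberg game reduces to two nested maximizations: compute the follower's best response to each leader action, then let the leader pick the action whose anticipated response serves her best. The action names and payoff tables below are made up for illustration, not taken from the paper.

```python
leader_actions = ["strike_A", "strike_B"]
follower_actions = ["defend_A", "defend_B"]

# payoff[(l, f)] = (leader_payoff, follower_payoff), illustrative numbers
payoff = {
    ("strike_A", "defend_A"): (1, 3),
    ("strike_A", "defend_B"): (4, 1),
    ("strike_B", "defend_A"): (3, 2),
    ("strike_B", "defend_B"): (2, 4),
}

def stackelberg(leader_actions, follower_actions, payoff):
    # Backward step: the follower best-responds to each leader action.
    def best_response(l):
        return max(follower_actions, key=lambda f: payoff[(l, f)][1])
    # Forward step: the leader maximizes, anticipating that response.
    l_star = max(leader_actions, key=lambda l: payoff[(l, best_response(l))][0])
    return l_star, best_response(l_star)

print(stackelberg(leader_actions, follower_actions, payoff))
# → ('strike_B', 'defend_B')
```

Note that the leader's equilibrium action here is not the one with the highest raw payoff entry (4): committing first means optimizing against the follower's reaction, not against the payoff table directly.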
Time Horizon and Cooperation in Continuous Time
We study social dilemmas in (quasi-) continuous-time experiments, comparing games with different durations and termination rules. We discover a stark qualitative contrast in behavior in continuous time as compared to previously studied behavior in discrete-time games: cooperation is easier to achieve and sustain with deterministic horizons than with stochastic ones, and end-game effects emerge, but subjects postpone them with experience. Analysis of individual strategies provides a basis for a simple reinforcement learning model that proves to be consistent with this evidence. An additional treatment lends further support to this explanation.
A Dynamic Level-k Model in Sequential Games
Backward induction is a widely accepted principle for predicting behavior in sequential games. In the classic example of the "centipede game," however, players frequently violate this principle. An alternative is a "dynamic level-k" model, where players choose a rule from a rule hierarchy. The rule hierarchy is iteratively defined such that the level-k rule is a best response to the level-(k−1) rule, and the level-∞ rule corresponds to backward induction. Players choose rules based on their best guesses of others' rules and use historical plays to improve their guesses. The model captures two systematic violations of backward induction in centipede games: limited induction and repetition unraveling. Because the dynamic level-k model always converges to backward induction over repetition, the former can be considered a tracing procedure for the latter. We also examine the generalizability of the dynamic level-k model by applying it to explain systematic violations of backward induction in sequential bargaining games. We show that the same model is capable of capturing these violations in two separate bargaining experiments. This paper was accepted by Peter Wakker, decision analysis.
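The backward-induction benchmark that the level-∞ rule corresponds to can be computed directly for a linear centipede game. The payoff scheme below (taking at node i pays the mover i + 2 and the other player i − 1; passing through all n nodes pays both players n) is an illustrative textbook-style variant, not the experimental design of any paper above.

```python
def bi_value(i, n):
    """Backward-induction payoffs (to the mover at node i, to the other
    player) from node i of an n-node centipede game."""
    take = (i + 2, i - 1)
    if i == n:
        cont = (n, n)          # both players pass to the end
    else:
        m, o = bi_value(i + 1, n)
        cont = (o, m)          # roles swap at the next node
    # The mover takes iff taking beats her continuation payoff.
    return take if take[0] >= cont[0] else cont

print(bi_value(1, 6))  # → (3, 0): take at the very first node
```

Even though passing to the end would pay both players 6, induction from the last node unravels cooperation completely; the "limited induction" the model captures is exactly subjects' failure to carry this unraveling all the way back to node 1.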