95 results for "Elie, Romuald"
Multiagent off-screen behavior prediction in football
In multiagent worlds, several decision-making individuals interact while adhering to the dynamics constraints imposed by the environment. These interactions, combined with the potential stochasticity of the agents’ dynamic behaviors, make such systems complex and interesting to study from a decision-making perspective. Significant research has been conducted on learning models for forward-direction estimation of agent behaviors, for example, pedestrian predictions used for collision-avoidance in self-driving cars. In many settings, only sporadic observations of agents may be available in a given trajectory sequence. In football, subsets of players may come in and out of view of broadcast video footage, while unobserved players continue to interact off-screen. In this paper, we study the problem of multiagent time-series imputation in the context of human football play, where available past and future observations of subsets of agents are used to estimate missing observations for other agents. Our approach, called the Graph Imputer, uses past and future information in combination with graph networks and variational autoencoders to enable learning of a distribution of imputed trajectories. We demonstrate our approach on multiagent settings involving players that are partially observable, using the Graph Imputer to predict the behaviors of off-screen players. To quantitatively evaluate the approach, we conduct experiments on football matches with ground truth trajectory data, using a camera module to simulate the off-screen player state estimation setting. We subsequently use our approach for downstream football analytics under partial observability using the well-established framework of pitch control, which traditionally relies on fully observed data. We illustrate that our method outperforms several state-of-the-art approaches, including those hand-crafted for football, across all considered metrics.
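As a structural illustration of the approach this abstract describes, here is a minimal sketch of a Graph-Imputer-style imputation step. Everything below (layer sizes, the fully connected player graph, the random stand-in weights) is an assumption for illustration, not the authors' implementation.

```python
# Minimal structural sketch of graph-network + VAE-style imputation
# (illustrative only; not the paper's actual model or parameters).
import numpy as np

rng = np.random.default_rng(0)
N, D, H, Z = 22, 2, 16, 8          # players, xy-dims, hidden size, latent size

# Random weights stand in for trained parameters.
W_in  = rng.normal(0, 0.1, (D + 1, H))    # node encoder (+1 for observed flag)
W_msg = rng.normal(0, 0.1, (H, H))        # message function
W_mu  = rng.normal(0, 0.1, (2 * H, Z))    # latent mean head
W_lv  = rng.normal(0, 0.1, (2 * H, Z))    # latent log-variance head
W_dec = rng.normal(0, 0.1, (Z + H, D))    # decoder back to xy positions

def impute_frame(pos, observed):
    """pos: (N, 2) player positions; observed: (N,) bool mask."""
    x = np.concatenate([np.where(observed[:, None], pos, 0.0),
                        observed[:, None].astype(float)], axis=1)
    h = np.tanh(x @ W_in)                                # node embeddings
    msg = np.tanh(h @ W_msg)                             # per-node messages
    agg = (msg.sum(0, keepdims=True) - msg) / (N - 1)    # mean over other nodes
    hc = np.concatenate([h, agg], axis=1)
    mu, logvar = hc @ W_mu, hc @ W_lv
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)  # reparameterize
    out = np.concatenate([z, h], axis=1) @ W_dec
    # keep observed positions, fill in the rest from the decoder
    return np.where(observed[:, None], pos, out)

pos = rng.uniform(0, 100, (N, D))
mask = rng.random(N) > 0.3          # ~30% of players off-screen
print(impute_frame(pos, mask).shape)  # (22, 2)
```

In a trained model the weights would come from optimising a VAE objective over trajectory sequences in both time directions; the sketch only shows how graph aggregation and a reparameterized latent combine to fill in unobserved players.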
TacticAI: an AI assistant for football tactics
Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing corner kicks, as they offer coaches the most direct opportunities for interventions and improvements. TacticAI incorporates both a predictive and a generative component, allowing the coaches to effectively sample and explore alternative player setups for each corner kick routine and to select those with the highest predicted likelihood of success. We validate TacticAI on a number of relevant benchmark tasks: predicting receivers and shot attempts and recommending player position adjustments. The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC. We show that TacticAI’s model suggestions are not only indistinguishable from real tactics, but also favoured over existing tactics 90% of the time, and that TacticAI offers an effective corner kick retrieval system. TacticAI achieves these results despite the limited availability of gold-standard data, achieving data efficiency through geometric deep learning. In modern football games, data-driven analysis serves as a key driver in determining tactics. Wang, Veličković, Hennes et al. develop a geometric deep learning algorithm, named TacticAI, to solve high-dimensional learning tasks over corner kicks and suggest tactics favoured over existing ones 90% of the time.
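One way geometric deep learning can buy the data efficiency credited above is by exploiting pitch symmetries. The sketch below illustrates reflection averaging on corner-kick coordinates; the transforms and the toy scoring model are my assumptions, not TacticAI's actual code.

```python
# Hypothetical sketch of reflection averaging over pitch symmetries
# (illustrative assumptions, not TacticAI's implementation).
import numpy as np

def reflection_views(coords):
    """The four reflection views of (N, 2) pitch coordinates normalised to
    [-1, 1]: identity, left-right flip, up-down flip, and both."""
    return [coords * np.array([sx, sy])
            for sx in (1.0, -1.0) for sy in (1.0, -1.0)]

def predict_receiver(score_fn, coords):
    """Average per-player scores over all views: every view depicts the same
    corner kick, so averaging makes the prediction reflection-invariant
    without requiring four times the training data."""
    scores = np.mean([score_fn(v) for v in reflection_views(coords)], axis=0)
    return int(np.argmax(scores))

# Toy stand-in scorer: favour players closest to the goal-mouth at (1, 0).
score_fn = lambda c: -np.linalg.norm(c - np.array([1.0, 0.0]), axis=1)
coords = np.random.default_rng(1).uniform(-1, 1, (22, 2))
print(predict_receiver(score_fn, coords))
```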
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both individual players’ and coordinated teams’ behaviors. The research challenges associated with predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. In this paper, we provide an overarching perspective highlighting how the combination of these fields, in particular, forms a unique microcosm for AI research, while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come. We illustrate that this duality makes football analytics a game changer of tremendous value, not only in terms of changing the game of football itself, but also in terms of what this domain can mean for the field of AI. We review the state-of-the-art and exemplify the types of analysis enabled by combining the aforementioned fields, including illustrative examples of counterfactual analysis using predictive models, and the combination of game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude by highlighting envisioned downstream impacts, including possibilities for extensions to other sports (real and virtual).
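The penalty-kick analysis mentioned above is a classic two-player zero-sum game. A minimal worked example, with illustrative scoring probabilities rather than the paper's data:

```python
# Mixed-strategy equilibrium of a 2x2 zero-sum penalty-kick game.
# The scoring probabilities are illustrative, not from the paper.
import numpy as np

# Rows: kicker aims Left/Right. Columns: keeper dives Left/Right.
# Entries: probability the kick scores.
A = np.array([[0.60, 0.95],    # kicker Left vs keeper Left/Right
              [0.90, 0.55]])   # kicker Right vs keeper Left/Right

# Each side mixes so the opponent is indifferent between its two actions.
den = A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1]
p_kicker_left = (A[1, 1] - A[1, 0]) / den   # kicker's P(aim Left)
q_keeper_left = (A[1, 1] - A[0, 1]) / den   # keeper's P(dive Left)
value = A[0, 0] * p_kicker_left + A[1, 0] * (1 - p_kicker_left)

print(f"kicker aims left {p_kicker_left:.2f}, keeper dives left {q_keeper_left:.2f}")
print(f"equilibrium scoring probability {value:.2f}")
```

With these numbers both players mix 50/57 and the kick scores 75% of the time at equilibrium; against either pure keeper strategy the kicker's mix yields the same 0.75, which is exactly the indifference condition.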
Deep reinforcement learning can promote sustainable human behaviour in a common-pool resource problem
A canonical social dilemma arises when resources are allocated to people, who can either reciprocate with interest or keep the proceeds. The right resource allocation mechanisms can encourage levels of reciprocation that sustain the commons. Here, in an iterated multiplayer trust game, we use deep reinforcement learning (RL) to design a social planner that promotes sustainable contributions from human participants. We first trained neural networks to behave like human players, creating a simulated economy that allows us to study the dynamics of receipt and reciprocation. We then use RL to train a mechanism to maximise aggregate return to players. The RL mechanism discovers a redistributive policy that leads to a large but also more equal surplus. The mechanism outperforms baseline mechanisms by conditioning its generosity on available resources and temporarily sanctioning defectors. Examining the RL policy allows us to develop a similar but explainable mechanism that is more popular among players. Koster et al. introduce a deep reinforcement learning (RL) mechanism for managing common-pool resources that successfully encourages sustainable cooperation among human participants by dynamically adjusting resource allocations based on the current state of the resource pool. The RL-derived policy outperforms traditional allocation methods by balancing generosity when resources are abundant and applying temporary sanctions to discourage free-riding, ultimately maximizing social welfare and fairness.
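A minimal sketch of the iterated multiplayer trust game underlying this study; the multiplier, round count, and behavioural stand-in below are simplifying assumptions, not the study's parameters.

```python
# Simplified iterated multiplayer trust game (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
N_PLAYERS, MULTIPLIER, ROUNDS = 4, 1.6, 10

def step(pool, allocation, reciprocation):
    """One round: the planner splits `pool` via `allocation` (sums to 1);
    each player keeps part and returns a `reciprocation` fraction, which
    grows with interest (the multiplier) to form the next round's pool."""
    received = pool * allocation
    returned = received * reciprocation
    kept = received - returned
    return MULTIPLIER * returned.sum(), kept

pool = 20.0
equal_split = np.full(N_PLAYERS, 1 / N_PLAYERS)
total_kept = np.zeros(N_PLAYERS)
for _ in range(ROUNDS):
    # stand-in for human behaviour: noisy, partially reciprocating players
    recip = np.clip(0.3 + 0.4 * rng.random(N_PLAYERS), 0, 1)
    pool, kept = step(pool, equal_split, recip)
    total_kept += kept
print(f"final pool {pool:.1f}, per-player earnings {np.round(total_kept, 1)}")
```

An RL planner would replace the fixed equal split, choosing `allocation` each round to maximise aggregate (and more equal) earnings, conditioning on the pool and on past reciprocation.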
Reinforcement Learning in Economics and Finance
Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process through repeated experience. In a given environment, the agent's policy provides running and terminal rewards. As in online learning, the agent learns sequentially. As in multi-armed bandit problems, when an agent picks an action, it cannot infer ex post the rewards that other action choices would have induced. In reinforcement learning, actions have consequences: they influence not only rewards, but also future states of the world. The goal of reinforcement learning is to find an optimal policy, a mapping from the states of the world to the set of actions, that maximizes cumulative reward over the long term. Exploring might be sub-optimal on a short-term horizon but can lead to optimal long-term behavior. Many problems of optimal control, popular in economics for more than forty years, can be expressed in the reinforcement learning framework, and recent advances in computational science, provided in particular by deep learning algorithms, can be used by economists to solve complex behavioral problems. In this article, we present a state-of-the-art survey of reinforcement learning techniques, together with applications in economics, game theory, operations research and finance.
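A minimal tabular Q-learning example of the framework the survey describes: states, actions, rewards, and a policy learned from repeated experience. The toy chain MDP is my own illustration.

```python
# Tabular Q-learning on a toy chain MDP (illustrative example).
import numpy as np

rng = np.random.default_rng(0)
N_STATES = 5
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1     # discount, learning rate, exploration

def env_step(s, a):
    """Chain MDP with actions 0 = left, 1 = right; moving right from the
    last state pays reward 1 and resets to the start."""
    if a == 1 and s == N_STATES - 1:
        return 0, 1.0
    return (max(s - 1, 0), 0.0) if a == 0 else (s + 1, 0.0)

Q = np.zeros((N_STATES, 2))
s = 0
for _ in range(20_000):
    # epsilon-greedy: short-term sub-optimal exploration can pay off later
    a = int(rng.integers(2)) if rng.random() < EPS else int(Q[s].argmax())
    s2, r = env_step(s, a)
    Q[s, a] += ALPHA * (r + GAMMA * Q[s2].max() - Q[s, a])   # TD update
    s = s2

print("greedy policy (0=left, 1=right):", Q.argmax(axis=1))  # expect all 1s
```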
BSDEs with mean reflection
In this paper, we study a new type of BSDE, where the distribution of the Y-component of the solution is required to satisfy an additional constraint, written in terms of the expectation of a loss function. This constraint is imposed at any deterministic time t and is typically weaker than the classical pointwise one associated with reflected BSDEs. Focusing on solutions (Y, Z, K) with deterministic K, we obtain the well-posedness of such an equation in the presence of a natural Skorokhod-type condition. This condition indeed ensures the minimality of the enhanced solution, under an additional structural condition on the driver. Our results extend to the more general framework where the constraint is written in terms of a static risk measure on Y. In particular, we provide an application to the super-hedging of claims under a running risk-management constraint.
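Schematically, and with notation reconstructed from the abstract rather than copied from the paper, the system studied here is of the form:

```latex
% Mean-reflected BSDE: the constraint bears on the law of Y through a
% loss function \ell, and K is deterministic and nondecreasing.
\begin{align*}
  Y_t &= \xi + \int_t^T f(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s + K_T - K_t, \\
  \mathbb{E}\!\left[\ell(t, Y_t)\right] &\ge 0 \quad \text{for all } t \in [0,T], \\
  \int_0^T \mathbb{E}\!\left[\ell(t, Y_t)\right] dK_t &= 0
  \qquad \text{(Skorokhod-type minimality condition).}
\end{align*}
```

The middle line is the constraint on the distribution of Y, weaker than the pointwise condition of a classical reflected BSDE; the last line is the Skorokhod-type condition ensuring K acts only when the constraint binds.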
Optimal claiming strategies in Bonus Malus systems and implied Markov chains
In this paper, we investigate the impact of the accident reporting strategy of drivers within a Bonus-Malus system. We exhibit the induced modification of the corresponding class-level transition matrix and derive the optimal reporting strategy for rational drivers. The hunger for bonuses induces optimal thresholds under which drivers do not claim their losses. Mathematical properties of the induced class-level process are studied. A convergent numerical algorithm is provided for computing such thresholds, and realistic numerical applications are discussed.
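A hedged numerical sketch of the threshold logic: a rational driver self-insures any loss smaller than the discounted extra premiums a claim would trigger. The three-class scale and premiums below are illustrative, not the paper's.

```python
# Illustrative Bonus-Malus claiming thresholds (simplified: no further
# accidents after the current loss; parameters are made up).
import numpy as np

PREMIUM = np.array([400.0, 600.0, 900.0])   # class 0 = bonus ... 2 = malus
AFTER_CLAIM = np.array([2, 2, 2])           # claiming sends you to the top
AFTER_FREE = np.array([0, 0, 1])            # a claim-free year moves you down
BETA, HORIZON = 0.95, 30                    # yearly discount factor, years

def discounted_premiums(start):
    """Total discounted premiums over the horizon with no further claims."""
    c, total = start, 0.0
    for t in range(HORIZON):
        total += (BETA ** t) * PREMIUM[c]
        c = AFTER_FREE[c]
    return total

for c in range(3):
    threshold = (discounted_premiums(AFTER_CLAIM[c])
                 - discounted_premiums(AFTER_FREE[c]))
    print(f"class {c}: self-insure losses below {threshold:.0f}")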
Double Kernel Estimation of Sensitivities
In this paper we address the general issue of estimating the sensitivity of the expectation of a random variable with respect to a parameter characterizing its evolution. In finance, for example, the sensitivities of the price of a contingent claim are called the Greeks. A new way of estimating the Greeks has recently been introduced in Elie, Fermanian and Touzi (2007) through a randomization of the parameter of interest combined with nonparametric estimation techniques. In this paper we study another type of estimator that turns out to be closely related to the score function, which is well known to be the optimal Greek weight. This estimator relies on the use of two distinct kernel functions, and the main interest of this paper is to provide its asymptotic properties. Under a slightly more stringent condition, its rate of convergence matches that of the estimator introduced in Elie, Fermanian and Touzi (2007) and outperforms the finite differences estimator. Beyond the technical interest of the proofs, this result is very encouraging for the development of new types of estimators for the sensitivities.
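A sketch of the randomized-parameter idea this line of work builds on, using a kernel-weighted local linear regression rather than the paper's exact double-kernel estimator: randomize the parameter, simulate, and read the sensitivity off as the local slope.

```python
# Randomized-parameter sensitivity estimation via kernel-weighted local
# linear regression (illustrative method, not the paper's estimator).
# Toy model: X ~ N(lam, 1), phi(x) = x^2, so E[phi] = lam^2 + 1 and the
# true sensitivity d/dlam E[phi] at lam0 = 1 is 2.
import numpy as np

rng = np.random.default_rng(0)
lam0, h, n = 1.0, 0.2, 200_000

u = rng.uniform(-h, h, n)                 # randomized parameter offsets
x = rng.normal(lam0 + u, 1.0)             # simulate X under lam0 + u
y = x ** 2                                # payoff phi(X)

sw = np.sqrt(1 - np.abs(u) / h)           # sqrt of triangular kernel weights
X = np.column_stack([np.ones(n), u])      # local linear design: level + slope
beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
print(f"estimated sensitivity {beta[1]:.3f} (true value 2)")
```

The slope coefficient of the weighted regression of phi(X) on the parameter offset is a consistent estimator of the sensitivity, the same object the Greek-weight estimators target.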
A Tale of a Principal and Many, Many Agents
In this paper, we investigate a moral hazard problem in finite time with lump-sum and continuous payments, involving infinitely many agents with mean-field type interactions, hired by one principal. By reinterpreting the mean-field game faced by each agent in terms of a mean-field forward-backward stochastic differential equation (FBSDE), we are able to rewrite the principal’s problem as a control problem for McKean-Vlasov stochastic differential equations. We review one general approach to tackling it, introduced recently using dynamic programming and Hamilton-Jacobi-Bellman (HJB for short) equations, and mention a second one based on the stochastic Pontryagin maximum principle. We solve the problem completely and explicitly in special cases, going beyond the usual linear-quadratic framework. We finally show in our examples that the optimal contract in the N-player model converges to the mean-field optimal contract when the number of agents goes to +∞.
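Schematically, a McKean-Vlasov control problem has the following generic form (my notation, not the paper's exact statement): the state dynamics depend on the law of the state itself, and the controller optimises over controls alpha.

```latex
% Generic McKean-Vlasov control problem: \mathcal{L}(X_t) is the law of X_t.
\begin{align*}
  dX_t &= b\big(t, X_t, \mathcal{L}(X_t), \alpha_t\big)\,dt
        + \sigma\big(t, X_t, \mathcal{L}(X_t), \alpha_t\big)\,dW_t, \\
  \sup_{\alpha}\; &\mathbb{E}\!\left[\int_0^T
  f\big(t, X_t, \mathcal{L}(X_t), \alpha_t\big)\,dt
  + g\big(X_T, \mathcal{L}(X_T)\big)\right].
\end{align*}
```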
Optimal lifetime consumption and investment under a drawdown constraint
We consider the infinite-horizon optimal consumption-investment problem under a drawdown constraint, i.e., when the wealth process never falls below a fixed fraction of its running maximum. We assume that the risky asset is driven by a geometric Brownian motion with constant coefficients. For a general class of utility functions, we provide the value function in explicit form and derive closed-form expressions for the optimal consumption and investment strategy.
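Written out schematically (notation mine), the problem is to maximise discounted lifetime utility of consumption while wealth stays above a fixed fraction of its running maximum:

```latex
% Consumption-investment under a drawdown constraint: c is the consumption
% rate, \theta the investment strategy, and \alpha \in [0,1) the fraction.
\begin{align*}
  \sup_{(c,\theta)}\;
  \mathbb{E}\!\left[\int_0^{\infty} e^{-\beta t}\, U(c_t)\,dt\right]
  \quad \text{subject to} \quad
  X_t \;\ge\; \alpha \sup_{s \le t} X_s \quad \text{for all } t \ge 0.
\end{align*}
```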