Advances in Online Convex Optimization, Games, and Problems with Bandit Feedback

by Cardoso, Adrian Rivera

in Convex analysis / Decision making / Distance learning / Educational technology / Gaming machines / Mathematics / Privacy

2020

Dissertation

Overview

In this thesis we study sequential decision making through the lens of Online Learning. Online Learning is a very powerful and general framework for multi-period decision making. Due to its simple formulation and effectiveness, it has become a tool of daily use in multibillion-dollar companies. Moreover, due to its beautiful theory and its tight connections with other fields, Online Learning has caught the attention of academics all over the world and driven first-class research.

In the first chapter of this thesis, joint work with Huan Xu, we study a problem called the Risk-Averse Convex Bandit. Risk aversion refers to the fact that humans prefer consistent sequences of good rewards to highly variable sequences with slightly better rewards. The Risk-Averse Convex Bandit addresses the fact that, while human decision makers are risk-averse, most algorithms for Online Learning are not. In this thesis we provide the first efficient algorithms with strong theoretical guarantees for the Risk-Averse Convex Bandit problem.

In the second chapter, joint work with Rachel Cummings, we study the problem of preserving privacy in the setting of online submodular minimization. Submodular functions have multiple applications in machine learning and economics, which usually involve sensitive data from individuals. Using tools from Online Convex Optimization, we provide the first ε-differentially private algorithms for this problem, which are almost as good as their non-private counterparts.

In the third chapter, joint work with Jacob Abernethy, He Wang, and Huan Xu, we study a dynamic version of two-player zero-sum games. Zero-sum games are ubiquitous in economics, and central to understanding Linear Programming Duality, Convex and Robust Optimization, and Statistics. For many decades it was thought that one could solve this kind of game using sublinear-regret algorithms for Online Convex Optimization. We show that while this is true when the game does not change with time, a naive application of these algorithms can be fatal if the game changes and the players are trying to compete with the Nash Equilibrium of the sum of the games in hindsight.

In the fourth chapter, joint work with He Wang and Huan Xu, we revisit the decade-old problem of Markov Decision Processes (MDPs) with Adversarial Rewards. MDPs provide a general mathematical framework for sequential decision making under uncertainty when there is a notion of 'state'; moreover, they are the backbone of all Reinforcement Learning. We provide an elegant algorithm for this problem using tools from Online Convex Optimization. The algorithm's performance is comparable with the current state of the art. We also consider the problem under the large state-space regime, and provide the first algorithm with strong theoretical guarantees.
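As an illustration of the static case mentioned in the third-chapter summary, the sketch below lets both players of a fixed zero-sum matrix game run Hedge (multiplicative weights), a standard sublinear-regret algorithm, and measures how far their time-averaged strategies are from a Nash Equilibrium. It is not taken from the thesis: the payoff matrix, the function names `hedge_self_play` and `duality_gap`, and the step-size choice are assumptions made for this example, and it does not cover the time-varying games the thesis studies.

```python
import math


def softmax(scores):
    """Numerically stable softmax: turns scores into a probability vector."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hedge_self_play(A, T=2000):
    """Both players run Hedge on the fixed game A (entries assumed in [-1, 1]).

    The row player maximizes x^T A y and the column player minimizes it.
    Returns the time-averaged mixed strategies of both players.
    """
    n, m = len(A), len(A[0])
    # Illustrative step sizes of the usual sqrt(log(#actions) / T) form.
    eta_x = math.sqrt(2 * math.log(n) / T)
    eta_y = math.sqrt(2 * math.log(m) / T)
    cum_gain = [0.0] * n  # cumulative payoff of each row action
    cum_loss = [0.0] * m  # cumulative payoff conceded by each column action
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(T):
        x = softmax([eta_x * g for g in cum_gain])
        y = softmax([-eta_y * l for l in cum_loss])
        for i in range(n):
            avg_x[i] += x[i] / T
            cum_gain[i] += sum(A[i][j] * y[j] for j in range(m))
        for j in range(m):
            avg_y[j] += y[j] / T
            cum_loss[j] += sum(x[i] * A[i][j] for i in range(n))
    return avg_x, avg_y


def duality_gap(A, x, y):
    """Best-response value against y minus best-response value against x.

    The gap is zero exactly when (x, y) is a Nash Equilibrium of A.
    """
    n, m = len(A), len(A[0])
    best_row = max(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    best_col = min(sum(x[i] * A[i][j] for i in range(n)) for j in range(m))
    return best_row - best_col


# Hypothetical 2x2 game used only for this example.
A = [[0.8, -0.4], [-0.6, 0.2]]
x_bar, y_bar = hedge_self_play(A, T=2000)
gap = duality_gap(A, x_bar, y_bar)
```

Because the duality gap of the averaged strategies is at most the sum of the two players' average regrets, it shrinks as `T` grows when the game is fixed. The abstract's point is that this argument does not carry over directly when the game changes over time.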

Publisher

ProQuest Dissertations & Theses

Subject

Convex analysis / Decision making / Distance learning / Educational technology / Gaming machines / Mathematics / Privacy