Data augmentation for efficient learning from parametric experts
by Galashov, Alexandre; Heess, Nicolas; Merel, Josh
in Algorithms / Cloning / Data augmentation / Distillation / Learning
2022
Paper
Overview
We present a simple yet powerful data-augmentation technique to enable data-efficient learning from parametric experts for reinforcement and imitation learning. We focus on what we call the policy cloning setting, in which we use online or offline queries of an expert or expert policy to inform the behavior of a student policy. This setting arises naturally in a number of problems, for instance as variants of behavior cloning, or as a component of other algorithms such as DAGGER, policy distillation or KL-regularized RL. Our approach, augmented policy cloning (APC), uses synthetic states to induce feedback-sensitivity in a region around sampled trajectories, thus dramatically reducing the environment interactions required for successful cloning of the expert. We achieve highly data-efficient transfer of behavior from an expert to a student policy for high-degrees-of-freedom control problems. We demonstrate the benefit of our method in the context of several existing and widely used algorithms that include policy cloning as a constituent part. Moreover, we highlight the benefits of our approach in two practically relevant settings: (a) expert compression, i.e., transfer to a student with fewer parameters; and (b) transfer from privileged experts, i.e., where the expert has a different observation space than the student, usually including access to privileged information.
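The core idea of querying the expert on synthetic states near sampled trajectories can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian perturbation, the names `augment_with_expert`, `expert_policy`, and `fit_scalar_gain`, the parameters `num_aug` and `sigma`, and the toy linear expert are all assumptions made for illustration only.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

State = List[float]
Action = List[float]


def augment_with_expert(
    states: Sequence[State],
    expert: Callable[[State], Action],
    num_aug: int = 4,
    sigma: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[Tuple[State, Action]]:
    """Build a cloning dataset from visited states plus synthetic neighbors.

    Each visited state is kept, and `num_aug` perturbed copies are added.
    The expert is queried on every state, real or synthetic, so the student
    also sees how the expert reacts slightly off the sampled trajectory.
    Gaussian noise with scale `sigma` is an assumption of this sketch.
    """
    rng = rng or random.Random(0)
    dataset: List[Tuple[State, Action]] = []
    for s in states:
        dataset.append((list(s), expert(s)))
        for _ in range(num_aug):
            s_aug = [x + rng.gauss(0.0, sigma) for x in s]
            dataset.append((s_aug, expert(s_aug)))
    return dataset


def expert_policy(state: State) -> Action:
    # Hypothetical expert: simple linear feedback, a = -2 * s.
    return [-2.0 * x for x in state]


def fit_scalar_gain(dataset: Sequence[Tuple[State, Action]]) -> float:
    # Hypothetical student: a single shared gain k with a = k * s,
    # fitted by closed-form least squares on the (state, action) pairs.
    num = sum(si * ai for s, a in dataset for si, ai in zip(s, a))
    den = sum(si * si for s, _ in dataset for si in s)
    return num / den


# A short sampled "trajectory" of 2-D states (made-up numbers).
trajectory = [[1.0, 0.5], [0.8, 0.3], [0.6, 0.1]]
data = augment_with_expert(trajectory, expert_policy, num_aug=4, sigma=0.1)
gain = fit_scalar_gain(data)
print(len(data), gain)
```

In this toy setting the student is fitted on the three visited states plus four expert-labeled synthetic neighbors of each one, without any further environment interaction. A real student would be a parametric policy trained by gradient descent on a cloning loss; the closed-form fit here only keeps the sketch short.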
Publisher
Cornell University Library, arXiv.org