Environment Design for Inverse Reinforcement Learning

Kleine Buening, Thomas; Dimitrakakis, Christos

doi:10.48550/arXiv.2210.14972

Author(s)

Kleine Buening, Thomas

Dimitrakakis, Christos

Institut d'informatique

Publication date

2022

In

Computing Research Repository (CoRR)

Vol.

2210.14972

Abstract

The task of learning a reward function from expert demonstrations suffers from high sample complexity as well as inherent limitations to what can be learned from demonstrations in a given environment. As the samples used for reward learning require human input, which is generally expensive, much effort has been dedicated to designing more sample-efficient algorithms. Moreover, even with abundant data, current methods can still fail to learn insightful reward functions that are robust to minor changes in the environment dynamics. We approach these challenges differently from prior work by improving the sample efficiency as well as the robustness of learned rewards through adaptively designing a sequence of demonstration environments for the expert to act in. We formalise a framework for this environment design process in which learner and expert repeatedly interact, and construct algorithms that actively seek information about the rewards by carefully curating environments for the human to demonstrate the task in.

Identifiers

https://libra.unine.ch/handle/123456789/30965

10.48550/arXiv.2210.14972

Publication type

journal article

File(s) to download

main article: 2210.14972.pdf (1.12 MB)
