A Minimax-Bayes Approach to Ad Hoc Teamwork
Author(s)
Thomas Kleine Buening
Publication date
2024
Notes
Learning policies for Ad Hoc Teamwork (AHT) is challenging. Most standard methods choose a specific distribution over training partners, which is assumed to mirror the distribution over partners after deployment. Moreover, they offer limited guarantees over worst-case performance. To tackle the issue, we propose using a worst-case prior distribution by adapting ideas from minimax-Bayes analysis to AHT.
We thereby explicitly account for our uncertainty about the partners at test time. Extensive experiments, including evaluations on coordination tasks from the Melting Pot suite, show our method's superior robustness compared to self-play, fictitious play, and best response learning w.r.t. policy populations. This highlights the importance of selecting an appropriate training distribution over teammates to achieve robustness in AHT.
Publication type
conference paper