Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?

Debabrota Basu; Christos Dimitrakakis; Aristide Tossou

doi:10.48550/arXiv.1905.12298

Author(s)

Debabrota Basu

Christos Dimitrakakis

Institut d'informatique

Aristide Tossou

Publication date

2019

In

Computing Research Repository (CoRR)

Vol.

1905.12298

Abstract

Based on the differential privacy (DP) framework, we introduce and unify privacy definitions for multi-armed bandit algorithms. We represent the framework with a unified graphical model and use it to connect the privacy definitions. We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions, and we obtain all of the lower bounds with a single proof technique. We show that, for all of them, the learner's regret increases by a multiplicative factor that depends on the privacy level ϵ. We observe that the dependency is weaker when we do not require local differential privacy for the rewards.
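As a rough illustration of what local differential privacy for rewards can look like in practice, the sketch below perturbs each bounded reward with the standard Laplace mechanism before the learner sees it. This is an assumption made for illustration only: the record does not state which mechanism the paper analyses, and the names `privatize_reward` and `noisy_mean` are hypothetical.

```python
import random


def privatize_reward(reward, epsilon, rng, low=0.0, high=1.0):
    """Return a noisy copy of a bounded reward using the Laplace mechanism.

    Assumption: rewards lie in [low, high], so the sensitivity of a single
    reward is (high - low) and Laplace noise of scale sensitivity / epsilon
    gives epsilon-local differential privacy for that reward.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    clipped = min(max(reward, low), high)
    rate = epsilon / (high - low)
    # The difference of two i.i.d. exponentials with this rate is a
    # zero-mean Laplace variable with scale 1 / rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return clipped + noise


def noisy_mean(true_mean, epsilon, n, seed=0):
    """Average n privatized Bernoulli(true_mean) rewards from one arm."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        reward = 1.0 if rng.random() < true_mean else 0.0
        total += privatize_reward(reward, epsilon, rng)
    return total / n


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(eps, noisy_mean(0.7, eps, n=20000))
```

The added noise has variance 2 / ϵ² per reward (for rewards in [0, 1]), so a smaller ϵ means the learner needs more pulls to estimate an arm's mean to the same accuracy. That matches the qualitative message of the abstract, although the sketch says nothing about the exact form of the paper's lower bounds.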

Identifiers

https://libra.unine.ch/handle/123456789/30971

10.48550/arXiv.1905.12298

arXiv:1905.12298v2

Publication type

journal article

File(s) to download

main article: 1905.12298.pdf (306.07 KB)
