Randomised Bayesian Least-Squares Policy Iteration

Author(s)
Nikolaos Tziortziotis
Christos Dimitrakakis  
Chaire de science des données  
Michalis Vazirgiannis
Date issued
April 6, 2019
In
Computing Research Repository (CoRR)
Vol
1904.03535
Subjects
cs.LG cs.AI stat.ML
Abstract
We introduce Bayesian least-squares policy iteration (BLSPI), an off-policy, model-free policy iteration algorithm that uses the Bayesian least-squares temporal-difference (BLSTD) learning algorithm to evaluate policies. We also propose an online variant of BLSPI, called randomised BLSPI (RBLSPI), which improves its policy based on an incomplete policy evaluation step. In the online setting, the exploration-exploitation dilemma must be addressed, since we try to discover the optimal policy using samples that we collect ourselves. RBLSPI exploits the ability of BLSTD to quantify our uncertainty about the value function. Inspired by Thompson sampling, RBLSPI first samples a value function from a posterior distribution over value functions, and then selects actions based on the sampled value function. The effectiveness and the exploration abilities of RBLSPI are demonstrated experimentally in several environments.
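The sample-then-act step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear value function Q(s, a) = w · φ(s, a) and, purely for illustration, an independent Gaussian posterior over the weights w. The abstract does not state the posterior's form, and the names `sample_weights`, `q_value` and `select_action` are hypothetical.

```python
import random


def sample_weights(mean, std, rng):
    """Draw one weight vector from an independent Gaussian posterior.

    The diagonal Gaussian form is an assumption made for this sketch.
    """
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def q_value(weights, features):
    """Linear value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def select_action(mean, std, features_per_action, rng):
    """Thompson-style step: sample a value function, then act greedily on it."""
    w = sample_weights(mean, std, rng)
    values = [q_value(w, phi) for phi in features_per_action]
    # Index of the action whose sampled value is largest.
    return max(range(len(values)), key=values.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two actions with one-hot features (hypothetical toy setup).
    features = [[1.0, 0.0], [0.0, 1.0]]
    # Equal means with non-zero uncertainty: the chosen action varies per draw.
    picks = [select_action([0.0, 0.0], [1.0, 1.0], features, rng) for _ in range(10)]
    print(picks)
```

In this toy setup, the posterior standard deviation controls exploration: when it is zero, the sampled weights equal the posterior mean and the choice is purely greedy, while larger values make the less-favoured action more likely to be tried.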
Publication type
journal article
Identifiers
https://libra.unine.ch/handle/20.500.14713/64469
DOI
10.48550/arXiv.1904.03535
1904.03535v1
File(s)
Name
1904.03535.pdf
Type
Main Article
Size
7.63 MB
Format
Adobe PDF