Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration - Statistics > Machine Learning





Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervised learning problem, have been proposed recently. Finding good policies with such methods requires not only an appropriate classifier, but also reliable examples of best actions, covering the state space sufficiently. Up to this time, little work has been done on appropriate covering schemes and on methods for reducing the sample complexity of such methods, especially in continuous state spaces. This paper focuses on the simplest possible covering scheme (a discretized grid over the state space) and performs a sample-complexity comparison between the simplest (and previously commonly used) rollout sampling allocation strategy, which allocates samples equally at each state under consideration, and an almost as simple method, which allocates samples only as needed and requires significantly fewer samples.
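
The contrast between the two allocation strategies can be illustrated with a small sketch. This is not the algorithm from the paper: the function names, the `rollout(state, action)` callback, the assumption that returns lie in [0, 1], and the Hoeffding-style stopping rule are all assumptions chosen for illustration. The first function spends a fixed number of rollouts on every action at every grid state; the second stops sampling a state as soon as its empirically best action is separated from the runner-up, up to the same per-action cap.

```python
import math


def hoeffding_radius(n, delta):
    # Half-width of a Hoeffding confidence interval after n samples,
    # assuming (for illustration) that rollout returns lie in [0, 1].
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def uniform_allocation(states, n_actions, rollout, per_action):
    """Fixed allocation: per_action rollouts for every action at every state."""
    labels, used = {}, 0
    for s in states:
        means = []
        for a in range(n_actions):
            total = sum(rollout(s, a) for _ in range(per_action))
            means.append(total / per_action)
            used += per_action
        # Label the state with its empirically best action.
        labels[s] = max(range(n_actions), key=means.__getitem__)
    return labels, used


def adaptive_allocation(states, n_actions, rollout, per_action, delta=0.05):
    """Sample a state only until its best action is clearly separated.

    Hypothetical stopping rule (an assumption, not the paper's): stop once
    the gap between the two highest empirical means exceeds twice the
    Hoeffding radius, or once per_action rollouts per action are spent.
    Requires n_actions >= 2.
    """
    labels, used = {}, 0
    for s in states:
        sums = [0.0] * n_actions
        for n in range(1, per_action + 1):
            for a in range(n_actions):
                sums[a] += rollout(s, a)
            used += n_actions
            means = sorted((x / n for x in sums), reverse=True)
            if means[0] - means[1] > 2.0 * hoeffding_radius(n, delta):
                break
        labels[s] = max(range(n_actions), key=sums.__getitem__)
    return labels, used
```

Because the adaptive variant is capped at the same per-action budget, it never uses more rollouts than the uniform one in this sketch; how much it saves depends on how well separated the actions are at each state.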



Author: Christos Dimitrakakis, Michail G. Lagoudakis

Source: https://arxiv.org/






