arxiv:2012.14755

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs

Published on Dec 29, 2020

AI-generated summary

A novel model-based algorithm, DisCo, efficiently learns ε-optimal goal-conditioned policies in an unknown environment without a reward function, improving the sample complexity of prior work and extending to cost-sensitive shortest-path problems.

Abstract

We investigate the exploration of an unknown environment when no reward function is provided. Building on the incremental exploration setting introduced by Lim and Auer [1], we define the objective of learning the set of ε-optimal goal-conditioned policies attaining all states that are incrementally reachable within L steps (in expectation) from a reference state s_0. In this paper, we introduce a novel model-based approach that interleaves discovering new states from s_0 and improving the accuracy of a model estimate that is used to compute goal-conditioned policies to reach newly discovered states. The resulting algorithm, DisCo, achieves a sample complexity scaling as O(L^5 S_{L+ε} Γ_{L+ε} A ε^{-2}), where A is the number of actions, S_{L+ε} is the number of states that are incrementally reachable from s_0 in L+ε steps, and Γ_{L+ε} is the branching factor of the dynamics over such states. This improves over the algorithm proposed in [1] in both ε and L at the cost of an extra Γ_{L+ε} factor, which is small in most environments of interest. Furthermore, DisCo is the first algorithm that can return an ε/c_{min}-optimal policy for any cost-sensitive shortest-path problem defined on the L-reachable states with minimum cost c_{min}. Finally, we report preliminary empirical results confirming our theoretical findings.
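To make the setting concrete, below is a minimal NumPy sketch of the objective described above, not of DisCo itself: assuming the tabular transition model P is already known, it runs value iteration on the unit-cost shortest-path problem to obtain the optimal expected hitting time of every goal from s_0, and collects the goals reachable within L steps in expectation. The learning problem DisCo actually solves is doing this while estimating P from samples, and the sketch also ignores the "incremental" restriction (policies may only pass through previously discovered states); the function names and the toy chain MDP are illustrative assumptions, not the paper's pseudocode.

```python
import numpy as np

def shortest_path_values(P, goal, n_iter=1000, tol=1e-8):
    """Optimal expected number of steps to reach `goal` from every state.

    P is a (S, A, S) array of transition probabilities for a tabular MDP.
    Value iteration for the unit-cost stochastic shortest-path problem:
    every step costs 1 and the goal state is treated as absorbing.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(n_iter):
        Q = 1.0 + P @ V          # Bellman backup, shape (S, A)
        Q[goal] = 0.0            # reaching the goal ends the episode
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmin(axis=1)   # values and a greedy goal-conditioned policy


def l_reachable_states(P, s0, L):
    """Goals whose optimal expected hitting time from s0 is at most L."""
    return {g for g in range(P.shape[0])
            if shortest_path_values(P, g)[0][s0] <= L}


# Tiny 3-state chain: action 0 moves one state to the right, action 1 stays.
P = np.zeros((3, 2, 3))
P[0, 0, 1] = P[1, 0, 2] = P[2, 0, 2] = 1.0
P[:, 1, :] = np.eye(3)
print(l_reachable_states(P, s0=0, L=2))   # -> {0, 1, 2}
```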
