arXiv:2111.00213

Adjacency Constraint for Efficient Hierarchical Reinforcement Learning

Published on Oct 30, 2021
Authors: Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen

Abstract

Goal-conditioned Hierarchical Reinforcement Learning (HRL) is a promising approach for scaling up reinforcement learning (RL) techniques. However, it often suffers from training inefficiency because the action space of the high level, i.e., the goal space, is large. Searching in a large goal space poses difficulties for both high-level subgoal generation and low-level policy learning. In this paper, we show that this problem can be effectively alleviated by restricting the high-level action space from the whole goal space to a k-step adjacent region of the current state using an adjacency constraint. We theoretically prove that in a deterministic Markov Decision Process (MDP) the proposed adjacency constraint preserves the optimal hierarchical policy, while in a stochastic MDP it induces a bounded state-value suboptimality determined by the MDP's transition structure. We further show that this constraint can be implemented in practice by training an adjacency network that discriminates between adjacent and non-adjacent subgoals. Experimental results on discrete and continuous control tasks, including challenging simulated robot locomotion and manipulation tasks, show that incorporating the adjacency constraint significantly boosts the performance of state-of-the-art goal-conditioned HRL approaches.
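
The adjacency network mentioned in the abstract can be illustrated with a short, self-contained sketch. The PyTorch code below is an illustration under stated assumptions, not the authors' reference implementation: the network architecture, the threshold eps (an embedding-space radius standing in for k-step adjacency), and the margin are hypothetical choices, and the contrastive-style loss only approximates the adjacent/non-adjacent discrimination described above.

import torch
import torch.nn as nn

class AdjacencyNetwork(nn.Module):
    # Embeds states so that Euclidean distance in the embedding space
    # approximates k-step adjacency in the environment (assumed design).
    def __init__(self, state_dim, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, state):
        return self.net(state)

def adjacency_loss(model, s, s_adj, s_far, eps=1.0, margin=0.2):
    # Pairs (s, s_adj) sampled within k steps of each other on stored
    # trajectories should embed within radius eps; pairs (s, s_far)
    # sampled farther apart should embed beyond eps + margin.
    d_adj = (model(s) - model(s_adj)).norm(dim=-1)
    d_far = (model(s) - model(s_far)).norm(dim=-1)
    return (torch.clamp(d_adj - eps, min=0.0)
            + torch.clamp(eps + margin - d_far, min=0.0)).mean()

# Usage sketch: once trained, the network can penalize high-level
# subgoals that fall outside the k-step adjacent region of the
# current state, restricting the effective high-level action space.
adj = AdjacencyNetwork(state_dim=4)
s = torch.randn(8, 4)   # batch of current states
g = torch.randn(8, 4)   # batch of proposed subgoals (in state space)
penalty = torch.clamp((adj(s) - adj(g)).norm(dim=-1) - 1.0, min=0.0).mean()

In a full agent, a penalty of this kind would typically be added to the high-level policy objective so that subgoal generation is steered toward states reachable within k steps.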
