arxiv:2308.09873

Skill Transformer: A Monolithic Policy for Mobile Manipulation

Published on Aug 19, 2023

Authors:

Abstract

We present Skill Transformer, an approach for solving long-horizon robotic tasks by combining conditional sequence modeling and skill modularity. Conditioned on egocentric and proprioceptive observations of a robot, Skill Transformer is trained end-to-end to predict both a high-level skill (e.g., navigation, picking, placing) and a whole-body low-level action (e.g., base and arm motion), using a transformer architecture and demonstration trajectories that solve the full task. A skill predictor module retains the composability and modularity of the overall task, while the policy still reasons about low-level actions and avoids the hand-off errors common in modular approaches. We test Skill Transformer on an embodied rearrangement benchmark and find that it performs robust task planning and low-level control in new scenarios, achieving a 2.5x higher success rate than baselines on hard rearrangement problems.
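The two-level output described in the abstract (a discrete skill plus a continuous whole-body action, both produced from the same observation) can be illustrated with a toy sketch. This is not the authors' implementation: simple linear maps stand in for the transformer, there is no training or observation history, and conditioning the action on the predicted skill is an assumption made here for illustration. The names `ToySkillPolicy`, `OBS_DIM`, and `ACTION_DIM`, and all dimensions, are hypothetical.

```python
import math
import random

SKILLS = ["navigate", "pick", "place"]  # skill names follow the abstract's examples
OBS_DIM = 4      # hypothetical size of a fused egocentric + proprioceptive feature
ACTION_DIM = 3   # hypothetical whole-body action size (base and arm commands)


def make_matrix(rows, cols, rng):
    """Random weight matrix; stands in for learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToySkillPolicy:
    """Toy stand-in for a monolithic policy: linear maps replace the transformer."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Head 1: observation -> distribution over skills.
        self.skill_head = make_matrix(len(SKILLS), OBS_DIM, rng)
        # Head 2: observation + one-hot skill -> action (assumed conditioning).
        self.action_head = make_matrix(ACTION_DIM, OBS_DIM + len(SKILLS), rng)

    def act(self, observation):
        skill_probs = softmax(matvec(self.skill_head, observation))
        skill_index = max(range(len(SKILLS)), key=lambda i: skill_probs[i])
        one_hot = [1.0 if i == skill_index else 0.0 for i in range(len(SKILLS))]
        # tanh keeps each action component in [-1, 1].
        raw_action = matvec(self.action_head, observation + one_hot)
        action = [math.tanh(v) for v in raw_action]
        return SKILLS[skill_index], skill_probs, action


policy = ToySkillPolicy(seed=0)
skill, skill_probs, action = policy.act([0.2, -0.1, 0.5, 0.0])
print(skill, [round(a, 3) for a in action])
```

Because both heads live in one policy object and read the same observation, there is no separate controller to hand off to when the predicted skill changes, which is the property the abstract contrasts with modular pipelines.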
