metadata

pipeline_tag: reinforcement-learning
tags:
  - deep
  - reinforcement
  - learning
  - world
  - models
library_name: pytorch
license: gpl-3.0

M³: A Modular World Model over Streams of Tokens

📄 Paper ▪️ 💾 Code ▪️ 🧠 Trained Model Weights

M³ is a modular world model that extends the token-based world model framework to handle diverse observation and action modalities through independent, modality-specific components. It also incorporates improvements from the existing literature to enhance agent performance, and achieves state-of-the-art sample efficiency among planning-free world models. It is the first planning-free world model to reach a human-level median score on Atari 100K, with superhuman performance on 13 games. The model weights provided here cover Atari 100K, DeepMind Control Suite Proprioceptive 500K, and Craftax (Symbolic) 1M.
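To make the idea of independent, modality-specific components concrete, here is a minimal, purely illustrative sketch in plain Python. It is not the M³ implementation: the class `ModularTokenizer`, the functions `tokenize_continuous` and `tokenize_categorical`, the uniform binning scheme, and the modality names are all assumptions introduced for this example. It only shows how separate per-modality tokenizers could feed one shared token stream.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


def tokenize_continuous(values: Sequence[float], bins: int = 16,
                        low: float = -1.0, high: float = 1.0) -> List[int]:
    """Hypothetical tokenizer: uniformly quantize each scalar into `bins` tokens."""
    tokens = []
    for v in values:
        clipped = min(max(v, low), high)          # keep the value inside [low, high]
        idx = int((clipped - low) / (high - low) * bins)
        tokens.append(min(idx, bins - 1))         # the upper edge maps to the last bin
    return tokens


def tokenize_categorical(values: Sequence[int]) -> List[int]:
    """Hypothetical tokenizer: categorical features are already token ids."""
    return [int(v) for v in values]


@dataclass
class ModularTokenizer:
    """Maps each named modality to its own tokenizer (illustrative only)."""
    tokenizers: Dict[str, Callable[[Sequence], List[int]]]

    def encode(self, observation: Dict[str, Sequence]) -> List[Tuple[str, int]]:
        # Each modality is tokenized independently, then appended to one stream.
        stream = []
        for name, tokenizer in self.tokenizers.items():
            for token in tokenizer(observation[name]):
                stream.append((name, token))
        return stream


tokenizer = ModularTokenizer({
    "proprio": tokenize_continuous,      # assumed continuous vector modality
    "inventory": tokenize_categorical,   # assumed categorical modality
})
stream = tokenizer.encode({"proprio": [-1.0, 0.0, 1.0], "inventory": [3, 0]})
print(stream)
# [('proprio', 0), ('proprio', 8), ('proprio', 15), ('inventory', 3), ('inventory', 0)]
```

In this sketch, supporting a new modality only requires registering another tokenizer, which is the property the modular design is meant to provide. How M³ actually tokenizes each modality and models the resulting stream is described in the paper and code linked above.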
