---
pipeline_tag: reinforcement-learning
tags:
- deep
- reinforcement
- learning
- world
- models
library_name: pytorch
license: gpl-3.0
---

# M3: A Modular World Model over Streams of Tokens

📄 [Paper](https://arxiv.org/abs/2502.11537) ▪️ 💾 [Code](https://github.com/leor-c/M3) ▪️ 🧠 [Trained Model Weights](https://huggingface.co/leorc/M3)

M3 is a modular world model that extends the token-based world model framework to handle diverse observation and action modalities through independent, modality-specific components. It incorporates improvements from existing literature to enhance agent performance and achieves state-of-the-art sample efficiency for planning-free world models. It is the first method of this kind to reach a human-level median score on Atari 100K, exhibiting superhuman performance on 13 games.

The model weights provided here cover Atari 100K, DeepMind Control Suite Proprioceptive 500K, and Craftax (Symbolic) 1M.
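
As a minimal sketch, the weights can probably be fetched with the `huggingface_hub` client, using the repository id from the model weights link above. The helper names `download_m3_weights` and `list_checkpoint_files` and the `m3_weights` target directory are illustrative assumptions, not part of the M3 codebase, and the snippet makes no assumption about how the files inside the repository are organized.

```python
from pathlib import Path

# Repository id taken from the "Trained Model Weights" link in this card.
REPO_ID = "leorc/M3"


def download_m3_weights(local_dir: str = "m3_weights") -> Path:
    """Download a snapshot of the weights repository and return its local path.

    `local_dir` is a hypothetical target directory chosen for this example.
    """
    # Imported lazily so the helpers below work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=REPO_ID, local_dir=local_dir))


def list_checkpoint_files(root: Path) -> list[str]:
    """Return the relative paths of all files under `root`, sorted.

    Hidden entries (such as a download cache directory) are skipped.
    """
    files = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and not any(part.startswith(".") for part in rel.parts):
            files.append(str(rel))
    return sorted(files)


# Example usage (requires network access and `pip install huggingface_hub`):
#   root = download_m3_weights()
#   for name in list_checkpoint_files(root):
#       print(name)
```

Listing the downloaded files first is a cautious way to find the checkpoint for a given benchmark; loading a checkpoint into the agent is handled by the code repository linked above.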