metadata
license: mit
tags:
- reinforcement-learning
- stable-baselines3
- mujoco
- ant-v4
- ppo
pipeline_tag: reinforcement-learning
library_name: stable-baselines3
model_name: PPO-Ant-v4
PPO - Ant-v4
A Proximal Policy Optimization (PPO) agent trained with stable-baselines3 on the MuJoCo Ant-v4
environment.
Details | |
---|---|
Environment | gymnasium==0.29 & mujoco==2.3 (Ant-v4) |
Algorithm | PPO (stable-baselines3==2.3.0) |
Timesteps | 100 000 |
Policy | MlpPolicy (2 × 64 hidden, tanh) |
Return (mean ± std) | ~964 |
Seed | 0 |
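A return like the one in the table can be re-measured by rolling out the policy for a few episodes and summarising the episode returns. The sketch below is one minimal way to do that, not the procedure used for this card: the helper name `evaluate_returns`, the episode count, and the per-episode seeding are all assumptions. It works with any Gymnasium-style environment (`reset` returning `(obs, info)`, `step` returning a 5-tuple).

```python
import statistics
from typing import Any, Callable, Tuple


def evaluate_returns(
    env: Any,
    predict: Callable[[Any], Any],
    n_episodes: int = 10,  # assumed; the card does not state an episode count
    seed: int = 0,
) -> Tuple[float, float]:
    """Roll out `predict` for `n_episodes` and return (mean, std) of episode returns."""
    returns = []
    for episode in range(n_episodes):
        # Seed each episode differently so the episodes are not identical.
        obs, _info = env.reset(seed=seed + episode)
        done = False
        total = 0.0
        while not done:
            action = predict(obs)
            obs, reward, terminated, truncated, _info = env.step(action)
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    # Population standard deviation over the evaluated episodes.
    return statistics.fmean(returns), statistics.pstdev(returns)


# Usage with a trained stable-baselines3 model already held in `model`
# (how the checkpoint is obtained is not covered here):
#   import gymnasium as gym
#   env = gym.make("Ant-v4")
#   mean, std = evaluate_returns(
#       env, lambda obs: model.predict(obs, deterministic=True)[0]
#   )
```

stable-baselines3 also ships `evaluate_policy` in `stable_baselines3.common.evaluation`, which returns the same mean and standard deviation pair and is usually the more convenient choice when the library is installed.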
Hyper-parameters
{
"n_steps": 128,
"batch_size": 64,
"n_epochs": 20,
"gamma": 0.99,
"learning_rate": 3e-4,
"ent_coef": 0.0,
"clip_range": 0.2
}
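The settings above map directly onto keyword arguments of the stable-baselines3 `PPO` constructor. The following is a minimal sketch of a training run under those settings, not the original training script: it assumes a single environment (`n_envs=1`, which the card does not state), and the names `update_budget` and `train` are hypothetical helpers introduced for illustration.

```python
import math

# Values copied from the card.
HYPERPARAMS = {
    "n_steps": 128,
    "batch_size": 64,
    "n_epochs": 20,
    "gamma": 0.99,
    "learning_rate": 3e-4,
    "ent_coef": 0.0,
    "clip_range": 0.2,
}
TOTAL_TIMESTEPS = 100_000
SEED = 0


def update_budget(total_timesteps, n_steps, batch_size, n_epochs, n_envs=1):
    """Return (rollouts collected, minibatch updates per rollout).

    Assumes n_envs=1 unless told otherwise; the card does not state it.
    """
    rollout_size = n_steps * n_envs
    n_rollouts = math.ceil(total_timesteps / rollout_size)
    updates_per_rollout = n_epochs * (rollout_size // batch_size)
    return n_rollouts, updates_per_rollout


def train():
    """Train PPO on Ant-v4 with the card's settings (needs SB3 and MuJoCo)."""
    # Imported lazily so the arithmetic helper above works without SB3 installed.
    from stable_baselines3 import PPO

    # The default MlpPolicy for PPO uses two 64-unit tanh layers.
    model = PPO("MlpPolicy", "Ant-v4", seed=SEED, verbose=1, **HYPERPARAMS)
    model.learn(total_timesteps=TOTAL_TIMESTEPS)
    return model
```

With one environment, `update_budget(TOTAL_TIMESTEPS, 128, 64, 20)` shows how the small `n_steps` and large `n_epochs` trade off: each 128-step rollout is split into two minibatches of 64 and reused for 20 epochs before new data is collected.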