---
license: mit
tags:
- reinforcement-learning
- stable-baselines3
- mujoco
- ant-v4
- ppo
pipeline_tag: reinforcement-learning
library_name: stable-baselines3
model_name: PPO-Ant-v4
---
# PPO - Ant-v4 🌟
A Proximal Policy Optimization (PPO) agent trained with **stable-baselines3** on the MuJoCo **`Ant-v4`** environment.
| Setting | Details |
|---|---|
| Environment | `Ant-v4` (`gymnasium==0.29`, `mujoco==2.3`) |
| Algorithm | PPO (`stable-baselines3==2.3.0`) |
| Timesteps | **100 000** |
| Policy | `MlpPolicy` *(2 × 64 hidden, tanh)* |
| Return (mean) | ≈ *964* (std not reported) |
| Seed | `0` |
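
A minimal sketch of how the settings in the table above and the values under Hyper-parameters might fit together; it is not the author's training script. `rollout_budget` and `train_and_evaluate` are hypothetical helper names, a single environment (`n_envs=1`) is assumed, and the number of evaluation episodes is an arbitrary choice. The `stable-baselines3` imports are done lazily so the arithmetic can be checked without MuJoCo installed.

```python
import math

# Values copied from this model card.
ENV_ID = "Ant-v4"
SEED = 0
TOTAL_TIMESTEPS = 100_000
PPO_KWARGS = {
    "n_steps": 128,
    "batch_size": 64,
    "n_epochs": 20,
    "gamma": 0.99,
    "learning_rate": 3e-4,
    "ent_coef": 0.0,
    "clip_range": 0.2,
}


def rollout_budget(total_timesteps, n_steps, batch_size, n_epochs, n_envs=1):
    """Derive rollout and gradient-step counts (n_envs=1 is an assumption)."""
    rollout_size = n_steps * n_envs  # transitions collected per rollout
    # Assumption: training keeps collecting full rollouts until the
    # timestep budget is reached, so the last rollout may overshoot it.
    n_rollouts = math.ceil(total_timesteps / rollout_size)
    minibatches = math.ceil(rollout_size / batch_size)  # per epoch
    return {
        "rollouts": n_rollouts,
        "timesteps_collected": n_rollouts * rollout_size,
        "gradient_steps_per_rollout": minibatches * n_epochs,
    }


def train_and_evaluate(n_eval_episodes=10):
    """Hypothetical helper: train with the card's settings, then evaluate."""
    # Lazy imports: only needed when training is actually run.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    env = gym.make(ENV_ID)
    # MlpPolicy is used with its defaults; the card lists 2 x 64 hidden, tanh.
    model = PPO("MlpPolicy", env, seed=SEED, verbose=0, **PPO_KWARGS)
    model.learn(total_timesteps=TOTAL_TIMESTEPS)
    # evaluate_policy returns the mean and standard deviation of episode returns.
    mean_return, std_return = evaluate_policy(
        model, model.get_env(), n_eval_episodes=n_eval_episodes
    )
    return model, mean_return, std_return


budget = rollout_budget(
    TOTAL_TIMESTEPS,
    PPO_KWARGS["n_steps"],
    PPO_KWARGS["batch_size"],
    PPO_KWARGS["n_epochs"],
)
print(budget)
```

Under these assumptions the budget works out to 782 rollouts (100,096 timesteps collected, slightly above the 100,000 requested) and 40 gradient steps per rollout. Calling `train_and_evaluate()` would also produce the standard deviation that the table leaves out, though results from a fresh run need not match the reported return.
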
## Hyper-parameters
```jsonc
{
  "n_steps": 128,
  "batch_size": 64,
  "n_epochs": 20,
  "gamma": 0.99,
  "learning_rate": 3e-4,
  "ent_coef": 0.0,
  "clip_range": 0.2
}
```