ppo-Pixelcopter-PLE-v0 / config.json

Commit History

SB3 PPO. Vectorized across 16 envs. ~9_000_000 timesteps of training. mean_reward=163 +/- 103. Training for an additional 50_000_000 timesteps resulted in a worse reward at evaluation.
28a0b97

CoreyMorris committed on