ppo-LunarLander-v2 / results.json
Commit: dan · Baseline of PPO @ 512k iterations · 10b1b7d
{"mean_reward": 243.42799649727232, "std_reward": 22.54883789340839, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-05-10T08:56:59.052129"}