lunar_lander_v2_ppo_4 / results.json
lunar lander model #4, using PPO trained with learning rate 0.0005 for 500K timesteps
0e6fc9b
165 Bytes
{"mean_reward": 249.37538823515524, "std_reward": 15.431378532153628, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2022-05-09T22:27:11.134823"}