metadata
tags:
- CartPole-v1
- reinforce
- reinforcement-learning
- custom-implementation
- deep-rl-class
model-index:
- name: Reinforce-CartPole-v1
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: CartPole-v1
      type: CartPole-v1
    metrics:
    - type: mean_reward
      value: 500.00 +/- 0.00
      name: mean_reward
      verified: false
Reinforce Agent playing CartPole-v1
This is a trained model of a Reinforce agent playing CartPole-v1. To learn how to use this model and train your own, see Unit 4 of the Deep Reinforcement Learning Course: https://huggingface.co/deep-rl-course/unit4/introduction
To train a great model, you need to modify the hyperparameters:
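# CartPole-v1 has a 4-dimensional observation and 2 discrete actions.
env_id = "CartPole-v1"
s_size = 4  # equals env.observation_space.shape[0]
a_size = 2  # equals env.action_space.n

cartpole_hyperparameters = {
    "h_size": 64,  # hidden layer size
    "n_training_episodes": 2000,
    "n_evaluation_episodes": 20,
    "max_t": 1000,  # step cap per episode (CartPole-v1 itself truncates at 500)
    "gamma": 0.99,  # discount factor
    "lr": 1e-3,  # learning rate
    "env_id": env_id,
    "state_space": s_size,
    "action_space": a_size,
}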
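To make the role of "gamma" concrete, here is a minimal sketch of the discounted return that Reinforce uses to weight each action. The function name `discounted_returns` is hypothetical and is not taken from the course code; implementations often also standardize the returns before computing the loss, which is not shown here.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# CartPole gives +1 per step; a full episode lasts 500 steps.
full_episode = discounted_returns([1.0] * 500, gamma=0.99)
print(round(full_episode[0], 1))  # about 99.3
print(full_episode[-1])           # 1.0
```

With gamma = 0.99 the return at the first step stays below 1 / (1 - 0.99) = 100 however long the episode runs, so early actions are credited for roughly the next hundred steps rather than the whole episode.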
Score Record:
Episode 100 Average Score: 29.39
Episode 200 Average Score: 40.43
Episode 300 Average Score: 62.50
Episode 400 Average Score: 140.69
Episode 500 Average Score: 257.97
Episode 600 Average Score: 385.96
Episode 700 Average Score: 444.55
Episode 800 Average Score: 471.07
Episode 900 Average Score: 425.36
Episode 1000 Average Score: 469.43
Episode 1100 Average Score: 482.73
Episode 1200 Average Score: 479.17
Episode 1300 Average Score: 492.68
Episode 1400 Average Score: 487.52
Episode 1500 Average Score: 485.91
Episode 1600 Average Score: 487.56
Episode 1700 Average Score: 485.40
Episode 1800 Average Score: 494.59
Episode 1900 Average Score: 488.71
Episode 2000 Average Score: 493.33
Episode 2100 Average Score: 496.70
Episode 2200 Average Score: 498.07
Episode 2300 Average Score: 498.38
Episode 2400 Average Score: 476.29
Episode 2500 Average Score: 485.02
Episode 2600 Average Score: 481.23
Episode 2700 Average Score: 498.21
Episode 2800 Average Score: 500.00
Episode 2900 Average Score: 496.20
Episode 3000 Average Score: 494.15
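The score record above can be read programmatically. The sketch below parses a few lines copied from it and finds the first logged checkpoint whose average reaches 475, the reward threshold Gym registers for CartPole-v1. The helper names `parse_scores` and `first_at_or_above` are hypothetical, and the code assumes every log line follows the "Episode N Average Score: X" pattern.

```python
import re

# A few lines copied from the score record.
LOG = """\
Episode 700 Average Score: 444.55
Episode 800 Average Score: 471.07
Episode 900 Average Score: 425.36
Episode 1000 Average Score: 469.43
Episode 1100 Average Score: 482.73
Episode 1200 Average Score: 479.17
"""

LINE_RE = re.compile(r"Episode\s+(\d+)\s+Average Score:\s+([\d.]+)")


def parse_scores(text):
    """Turn the log text into a list of (episode, average_score) pairs."""
    return [(int(ep), float(score)) for ep, score in LINE_RE.findall(text)]


def first_at_or_above(scores, threshold):
    """Return the first (episode, score) pair with score >= threshold, or None."""
    for episode, score in scores:
        if score >= threshold:
            return episode, score
    return None


scores = parse_scores(LOG)
print(first_at_or_above(scores, 475.0))  # (1100, 482.73)
```

The dip at episode 900 shows why a single checkpoint is a noisy signal: policy-gradient training can regress after an apparently good stretch, so it is worth reading the whole record rather than only its last line.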