DQN Agent playing MountainCar-v0

This is a trained model of a DQN agent playing MountainCar-v0. We train a three-layer MLP as the Q-network and store the transitions in a replay buffer. Once the network converges, we stop training and evaluate the agent against a random baseline.
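The replay buffer mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code used to train this model: the `Transition` fields and the `ReplayBuffer` class name are hypothetical, and uniform random sampling is assumed. The default capacity and batch size mirror the `buffer_size` and `batch_size` values listed under Parameters.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative only.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size FIFO store of transitions (assumed uniform sampling)."""

    def __init__(self, capacity=10000):
        # A deque with maxlen silently drops the oldest transition when full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Draw a minibatch without replacement.
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)
```

A training loop would typically call `push` after every environment step and start calling `sample` only once `len(buffer)` reaches the batch size.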

Parameters:

hidden_size = 64
gamma = 0.99
epsilon_decay = 0.999
buffer_size = 10000
batch_size = 64
episodes = 10000
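
To make the roles of `gamma` and `epsilon_decay` concrete, here is a hedged sketch of how such values are commonly used in DQN. The `DQNConfig` class, the helper names, the per-episode multiplicative decay, and the `eps_start` and `eps_min` values are all assumptions for illustration; the model card does not state how epsilon is actually scheduled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNConfig:
    # Values copied from the parameter list above.
    hidden_size: int = 64
    gamma: float = 0.99
    epsilon_decay: float = 0.999
    buffer_size: int = 10000
    batch_size: int = 64
    episodes: int = 10000


def epsilon_at(episode, cfg, eps_start=1.0, eps_min=0.01):
    # Assumed schedule: multiply epsilon by epsilon_decay once per episode
    # and clip at eps_min. eps_start and eps_min are hypothetical values.
    return max(eps_min, eps_start * cfg.epsilon_decay ** episode)


def td_target(reward, next_q_values, done, cfg):
    # Standard one-step DQN target: r + gamma * max_a' Q(s', a'),
    # with no bootstrapping from terminal states.
    if done:
        return reward
    return reward + cfg.gamma * max(next_q_values)


cfg = DQNConfig()
print(epsilon_at(0, cfg))                            # exploration rate at the start
print(td_target(-1.0, [0.0, 1.0, 2.0], False, cfg))  # target for a non-terminal step
```

Under these assumed defaults, epsilon would reach its floor after roughly 4,600 episodes, leaving the remaining episodes almost fully greedy.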