---
pipeline_tag: reinforcement-learning
tags:
- ppo
---
PPO agents trained in a self-play setting. The agents were trained only on observations from the left player's perspective.

This repo includes checkpoints collected during training for 4 experiments:
- Shared weights for actor and critic
- No shared weights
- Resumed training for extra steps, for both the shared and non-shared setups
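
To illustrate the difference between the shared and non-shared setups, here is a minimal pure-Python sketch. It is not the architecture used for these checkpoints: the class names, layer sizes, `tanh` activation, and single hidden layer are all assumptions chosen for illustration. In the shared variant one trunk feeds both the policy head and the value head; in the separate variant the actor and the critic each own a trunk.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layer, x, activation=None):
    """Apply a dense layer to input vector x, optionally with an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def count_params(layers):
    """Count scalar parameters across a list of layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


class SharedActorCritic:
    """Hypothetical example: one trunk feeds both policy and value heads."""

    def __init__(self, obs_dim, hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.trunk = make_layer(obs_dim, hidden, rng)
        self.policy_head = make_layer(hidden, n_actions, rng)
        self.value_head = make_layer(hidden, 1, rng)

    def __call__(self, obs):
        h = forward(self.trunk, obs, math.tanh)
        return forward(self.policy_head, h), forward(self.value_head, h)[0]

    def num_params(self):
        return count_params([self.trunk, self.policy_head, self.value_head])


class SeparateActorCritic:
    """Hypothetical example: actor and critic each own a trunk."""

    def __init__(self, obs_dim, hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.actor_trunk = make_layer(obs_dim, hidden, rng)
        self.policy_head = make_layer(hidden, n_actions, rng)
        self.critic_trunk = make_layer(obs_dim, hidden, rng)
        self.value_head = make_layer(hidden, 1, rng)

    def __call__(self, obs):
        h_actor = forward(self.actor_trunk, obs, math.tanh)
        h_critic = forward(self.critic_trunk, obs, math.tanh)
        return forward(self.policy_head, h_actor), forward(self.value_head, h_critic)[0]

    def num_params(self):
        return count_params(
            [self.actor_trunk, self.policy_head, self.critic_trunk, self.value_head]
        )


# Assumed sizes for illustration only: 6 observation features, 8 hidden units, 3 actions.
shared = SharedActorCritic(obs_dim=6, hidden=8, n_actions=3)
separate = SeparateActorCritic(obs_dim=6, hidden=8, n_actions=3)
logits, value = shared([0.1] * 6)
print(shared.num_params(), separate.num_params())  # 92 148
```

With shared weights, the policy and value losses both update the trunk, so the model has fewer parameters but the two objectives can interfere. Separate networks avoid that interference at the cost of a second trunk.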