model card enhancement

Files changed (4)

.gitattributes +1 -0
README.md +8 -19
atari.gif +3 -0
config.json +32 -0

.gitattributes CHANGED

@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 *best_online_params filter=lfs diff=lfs merge=lfs -text

+*.gif filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,35 +1,24 @@
 ---
 license: mit
 license_link: https://huggingface.co/TheoVincent/Atari_i-QN/blob/main/LICENSE
-language:
-- en
 tags:
   - reinforcement-learning
   - jax
-  - eval-results
-  - deep-reinforcement-learning
   - atari
-  - dqn
-  - iqn
 ---
-# Model parameters training with `i-DQN` and `i-IQN`
-This repository contains the model parameters trained with `i-DQN` on [$56$ Atari games](#list-of-games-for-i-dqn) and trained with `i-IQN` on [$20$ Atari games](#list-of-games-for-i-iqn) 🎮. $5$ seeds are available for each configuration which makes a total of $380$ available models 📈.
-The [evaluate.ipynb](./evaluate.ipynb) notebook contains a minimal example to evaluate to model parameters 🧑‍🏫. It uses JAX 🚀.
-ps: The set of [$20$ Atari games](#list-of-games-for-i-iqn) is included in the set of [$56$ Atari games](#list-of-games-for-i-dqn).
 ### Model performances
-`i-DQN` and `i-IQN` are improvements made over [`DQN`](https://www.nature.com/articles/nature14236.pdf) and [`IQN`](https://arxiv.org/abs/1806.06923) ✨. Check it out on [arXiv](https://arxiv.org/abs/2403.02107)! | <img src="performances.png" alt="drawing" width="600"/>
-:-:|:-:
-### List of games for `i-DQN`
-Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, Centipede, ChopperCommand, CrazyClimber, DemonAttack, DoubleDunk, Enduro, FishingDerby, Freeway, Frostbite, Gopher, Gravitar, Hero, IceHockey, Jamesbond, Kangaroo, Krull, KungFuMaster, MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, Pooyan, PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris, SpaceInvaders, StarGunner, Tennis, TimePilot, Tutankham, UpNDown, Venture, VideoPinball, WizardOfWor, YarsRevenge, Zaxxon.
-### List of games for `i-IQN`
-Alien, Assault, BankHeist, Berzerk, Breakout, Centipede, ChopperCommand, DemonAttack, Enduro, Frostbite, Gopher, Gravitar, IceHockey, Jamesbond, Krull, KungFuMaster, Riverraid, Seaquest, Skiing, StarGunner.
 ## User installation
 Python 3.10 is recommended. Create a Python virtual environment, activate it, update pip and install the package and its dependencies in editable mode:

 ---
 license: mit
 license_link: https://huggingface.co/TheoVincent/Atari_i-QN/blob/main/LICENSE
 tags:
   - reinforcement-learning
   - jax
   - atari
+co2_eq_emissions:
+  emissions: 3000
 ---
+# Model parameters trained with `i-DQN` and `i-IQN`
+This repository contains the model parameters trained with `i-DQN` on [56 Atari games](#i-DQN_games) and trained with `i-IQN` on [20 Atari games](#i-IQN_games) 🎮. 5 seeds are available for each configuration which makes a total of 380 available models 📈.
+The [evaluate.ipynb](./evaluate.ipynb) notebook contains a minimal example to evaluate the model parameters 🧑‍🏫. It uses JAX 🚀. The hyperparameters used during training are reported in [config.json](./config.json) 🔧.
+PS: The set of [20 Atari games](#i-IQN_games) is included in the set of [56 Atari games](#i-DQN_games).
 ### Model performances
+| `i-DQN` and `i-IQN` are improvements made over [`DQN`](https://www.nature.com/articles/nature14236.pdf) and [`IQN`](https://arxiv.org/abs/1806.06923) ✨. Check the paper on [arXiv](https://arxiv.org/abs/2403.02107)! <details> <summary id=i-DQN_games>List of games trained with `i-DQN`</summary> *Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, Centipede, ChopperCommand, CrazyClimber, DemonAttack, DoubleDunk, Enduro, FishingDerby, Freeway, Frostbite, Gopher, Gravitar, Hero, IceHockey, Jamesbond, Kangaroo, Krull, KungFuMaster, MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, Pooyan, PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris, SpaceInvaders, StarGunner, Tennis, TimePilot, Tutankham, UpNDown, Venture, VideoPinball, WizardOfWor, YarsRevenge, Zaxxon.* </details> <details> <summary id=i-IQN_games>List of games trained with `i-IQN`</summary> *Alien, Assault, BankHeist, Berzerk, Breakout, Centipede, ChopperCommand, DemonAttack, Enduro, Frostbite, Gopher, Gravitar, IceHockey, Jamesbond, Krull, KungFuMaster, Riverraid, Seaquest, Skiing, StarGunner.* </details> | <img src="performances.png" alt="drawing" width="600"/> |
+| :-: | :-: |
 ## User installation
 Python 3.10 is recommended. Create a Python virtual environment, activate it, update pip and install the package and its dependencies in editable mode:

atari.gif ADDED

Git LFS Details

SHA256: 9084236da60018fb7538ed8b5a1a65c76a22cc8fea4a2423c639c3d81ca71760
Pointer size: 131 Bytes
Size of remote file: 690 kB

config.json ADDED

@@ -0,0 +1,32 @@
+{
+    "---- Shared parameters ---": "----------------",
+    "gamma": 0.99,
+    "replay_buffer_size": 1000000,
+    "n_initial_samples": 20000,
+    "n_epochs": 200,
+    "n_training_steps_per_epoch": 250000,
+    "n_training_steps_per_online_update": 4,
+    "horizon": 27000,
+    "starting_eps": 1,
+    "ending_eps": 0.01,
+    "duration_eps": 250000,
+    "batch_size": 32,
+    "n_step_return": 5,
+    "---- i-DQN ---": "----------------------------",
+    "idqn_learning_rate": 6.25e-5,
+    "idqn_optimizer_eps": 1.5e-4,
+    "idqn_n_training_steps_per_target_update": 3000000,
+    "idqn_n_training_steps_per_rolling_step": 8000,
+    "idqn_head_behaviorial_policy": "uniform",
+    "idqn_shared_network": true,
+    "---- i-IQN ---": "----------------------------",
+    "iiqn_learning_rate": 0.00005,
+    "iiqn_optimizer_eps": 0.0003125,
+    "iiqn_n_training_steps_per_target_update": 3000000,
+    "iiqn_n_training_steps_per_rolling_step": 8000,
+    "iiqn_head_behaviorial_policy": "uniform",
+    "iiqn_n_quantiles_policy": 32,
+    "iiqn_n_quantiles": 64,
+    "iiqn_n_quantiles_target": 64,
+    "iiqn_shared_network": true
+}
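
The `config.json` above mixes real hyperparameters with decorative separator keys (the ones starting with `----`), so a reader loading it will probably want to filter those out first. The sketch below is one possible way to do that and to derive two quantities from the shared parameters. It is only an illustration: the helper names `strip_separators`, `total_training_steps`, and `linear_epsilon` are hypothetical, and the linear shape of the epsilon schedule is an assumption, since the commit only records the start value, end value, and duration.

```python
import json

# A subset of the shared parameters, copied from the config.json in this commit.
# In practice one would read the file itself, e.g. json.load(open("config.json")).
CONFIG_TEXT = """
{
    "---- Shared parameters ---": "----------------",
    "n_epochs": 200,
    "n_training_steps_per_epoch": 250000,
    "starting_eps": 1,
    "ending_eps": 0.01,
    "duration_eps": 250000
}
"""


def strip_separators(raw):
    """Drop the decorative '----' keys so only hyperparameters remain."""
    return {key: value for key, value in raw.items() if not key.startswith("----")}


def total_training_steps(cfg):
    """Number of training steps over the whole run: epochs times steps per epoch."""
    return cfg["n_epochs"] * cfg["n_training_steps_per_epoch"]


def linear_epsilon(step, cfg):
    """Exploration rate at a given step.

    ASSUMPTION: the decay is linear from starting_eps to ending_eps over
    duration_eps steps and constant afterwards. The commit does not state
    the schedule shape.
    """
    fraction = min(max(step / cfg["duration_eps"], 0.0), 1.0)
    return cfg["starting_eps"] + fraction * (cfg["ending_eps"] - cfg["starting_eps"])


config = strip_separators(json.loads(CONFIG_TEXT))

if __name__ == "__main__":
    print("total training steps:", total_training_steps(config))  # 50000000
    for step in (0, 125_000, 250_000, 1_000_000):
        print(step, round(linear_epsilon(step, config), 4))
```

Under that assumed schedule, exploration would reach its floor of `ending_eps` after the first epoch's worth of steps, because `duration_eps` equals `n_training_steps_per_epoch`.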