# Model parameters trained with `i-DQN` and `i-IQN`

This repository contains the model parameters trained with `i-DQN` on [$57$ Atari games](#list-of-games-for-i-dqn) and with `i-IQN` on [$20$ Atari games](#list-of-games-for-i-iqn) 🎮. $5$ seeds are available for each configuration, which makes a total of $385$ available models.

The [evaluate.ipynb](./evaluate.ipynb) notebook contains a minimal example showing how to evaluate the model parameters 🧑‍🏫. It uses JAX.

Note: the set of [$20$ Atari games](#list-of-games-for-i-iqn) is included in the set of [$57$ Atari games](#list-of-games-for-i-dqn).
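
As a quick sanity check of the model count stated above, the short Python sketch below recomputes it from the figures given in this README ($57$ games for `i-DQN`, $20$ games for `i-IQN`, $5$ seeds each). The names `N_SEEDS`, `N_GAMES` and `count_models` are illustrative only and are not part of the repository.

```python
# Counts taken from this README; the variable names are illustrative.
N_SEEDS = 5
N_GAMES = {"i-DQN": 57, "i-IQN": 20}


def count_models(n_games: dict, n_seeds: int) -> int:
    """Return the number of (algorithm, game, seed) combinations."""
    return sum(games * n_seeds for games in n_games.values())


# Number of models per algorithm, then the overall total.
per_algorithm = {name: games * N_SEEDS for name, games in N_GAMES.items()}
total = count_models(N_GAMES, N_SEEDS)

print(per_algorithm)  # {'i-DQN': 285, 'i-IQN': 100}
print(total)          # 385
```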

### Model performances

`i-DQN` and `i-IQN` are improvements made over [`DQN`](https://www.nature.com/articles/nature14236.pdf) and [`IQN`](https://arxiv.org/abs/1806.06923) ✨. Check it out on [arXiv](https://arxiv.org/abs/2403.02107)! | <img src="performances.png" alt="drawing" width="600"/>
:-:|:-:

### List of games for `i-DQN`

Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, Centipede, ChopperCommand, CrazyClimber, DemonAttack, DoubleDunk, Enduro, FishingDerby, Freeway, Frostbite, Gopher, Gravitar, Hero, IceHockey, Jamesbond, Kangaroo, Krull, KungFuMaster, MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, Pooyan, PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris, SpaceInvaders, StarGunner, Tennis, TimePilot, Tutankham, UpNDown, Venture, VideoPinball, WizardOfWor, YarsRevenge, Zaxxon.

### List of games for `i-IQN`

Alien, Assault, BankHeist, Berzerk, Breakout, Centipede, ChopperCommand, DemonAttack, Enduro, Frostbite, Gopher, Gravitar, IceHockey, Jamesbond, Krull, KungFuMaster, Riverraid, Seaquest, Skiing, StarGunner.

## User installation

Python 3.10 is recommended. Create a Python virtual environment, activate it, upgrade pip, and install the package and its dependencies in editable mode:

```bash
python3.10 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install numpy==1.23.5  # to avoid numpy==2.XX
pip install -r requirements.txt
pip install --upgrade "jax[cuda12_pip]==0.4.13" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
```
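
After installation, it can be useful to confirm that the pinned versions were actually picked up by the environment. The following is a minimal sketch using only the Python standard library. The `EXPECTED` pins are copied from the commands above, and the helper name `check_versions` is hypothetical, not something the repository provides.

```python
import sys
from importlib import metadata

# Pins copied from the installation commands above; adjust them if
# requirements.txt specifies something different.
EXPECTED = {"numpy": "1.23.5", "jax": "0.4.13"}


def check_versions(expected: dict) -> dict:
    """Return {package: (installed_version_or_None, expected_version, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed in this environment
        report[name] = (installed, wanted, installed == wanted)
    return report


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, (installed, wanted, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={wanted} [{status}]")
```

Run it with the virtual environment activated. A `MISMATCH` line only indicates that the installed version differs from the pin, not necessarily that evaluation will fail.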

## Citing `i-QN`

```
@article{vincent2024iterated,
  title={Iterated $ Q $-Network: Beyond the One-Step Bellman Operator},
  author={Vincent, Th{\'e}o and Palenicek, Daniel and Belousov, Boris and Peters, Jan and D'Eramo, Carlo},
  journal={arXiv preprint arXiv:2403.02107},
  year={2024}
}
```