
Model parameters trained with i-DQN and i-IQN

This repository contains model parameters trained with i-DQN on $57$ Atari games and with i-IQN on $20$ Atari games 🎮. $5$ seeds are available for each configuration, for a total of $385$ models 📈.
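The total follows directly from the counts above. The snippet below is only a sketch of that arithmetic; the names `GAMES_PER_ALGORITHM`, `N_SEEDS` and `count_models` are illustrative and do not come from the repository.

```python
# Counts taken from the paragraph above; all names here are illustrative.
N_SEEDS = 5
GAMES_PER_ALGORITHM = {"i-DQN": 57, "i-IQN": 20}


def count_models(games_per_algorithm, n_seeds):
    """Return the number of (algorithm, game, seed) combinations."""
    return sum(n_games * n_seeds for n_games in games_per_algorithm.values())


total_models = count_models(GAMES_PER_ALGORITHM, N_SEEDS)
print(total_models)  # 57 * 5 + 20 * 5 = 385
```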

The evaluate.ipynb notebook contains a minimal example for evaluating the model parameters 🧑‍🏫. It uses JAX 🚀.

PS: the $20$ Atari games used for i-IQN are a subset of the $57$ Atari games used for i-DQN.

Model performances

i-DQN and i-IQN improve on DQN and IQN ✨. Check out the paper on arXiv (arXiv:2403.02107)!

List of games for i-DQN

Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, Centipede, ChopperCommand, CrazyClimber, DemonAttack, DoubleDunk, Enduro, FishingDerby, Freeway, Frostbite, Gopher, Gravitar, Hero, IceHockey, Jamesbond, Kangaroo, Krull, KungFuMaster, MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, Pooyan, PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris, SpaceInvaders, StarGunner, Tennis, TimePilot, Tutankham, UpNDown, Venture, VideoPinball, WizardOfWor, YarsRevenge, Zaxxon.

List of games for i-IQN

Alien, Assault, BankHeist, Berzerk, Breakout, Centipede, ChopperCommand, DemonAttack, Enduro, Frostbite, Gopher, Gravitar, IceHockey, Jamesbond, Krull, KungFuMaster, Riverraid, Seaquest, Skiing, StarGunner.
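As a quick sanity check of the PS above, the following sketch verifies that every i-IQN game also appears in the i-DQN list. The game names are copied from the two lists as written in this README; the variable names and the `parse_games` helper are illustrative only.

```python
def parse_games(text):
    """Split a comma-separated list of game names into a clean Python list."""
    return [name.strip() for name in text.replace("\n", " ").split(",") if name.strip()]


# Names copied verbatim from the two lists in this README.
IDQN_GAMES = parse_games("""
Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone,
BeamRider, Berzerk, Bowling, Boxing, Breakout, Centipede, ChopperCommand,
CrazyClimber, DemonAttack, DoubleDunk, Enduro, FishingDerby, Freeway, Frostbite,
Gopher, Gravitar, Hero, IceHockey, Jamesbond, Kangaroo, Krull, KungFuMaster,
MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, Pooyan,
PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris,
SpaceInvaders, StarGunner, Tennis, TimePilot, Tutankham, UpNDown, Venture,
VideoPinball, WizardOfWor, YarsRevenge, Zaxxon
""")

IIQN_GAMES = parse_games("""
Alien, Assault, BankHeist, Berzerk, Breakout, Centipede, ChopperCommand,
DemonAttack, Enduro, Frostbite, Gopher, Gravitar, IceHockey, Jamesbond, Krull,
KungFuMaster, Riverraid, Seaquest, Skiing, StarGunner
""")

# Games that have i-IQN parameters but no i-DQN parameters (expected: none).
missing_from_idqn = sorted(set(IIQN_GAMES) - set(IDQN_GAMES))
print(missing_from_idqn)
```

An empty `missing_from_idqn` list means that any game with i-IQN parameters also has i-DQN parameters, so the two algorithms can be compared on all $20$ i-IQN games.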

User installation

Python 3.10 is recommended. Create a Python virtual environment, activate it, update pip and install the package and its dependencies in editable mode:

python3.10 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install numpy==1.23.5  # to avoid numpy==2.XX
pip install -r requirements.txt
pip install --upgrade "jax[cuda12_pip]==0.4.13" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
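After installation, it can help to confirm that JAX imports correctly and to see which devices it detects. The following is a minimal sketch, not part of the repository: `describe_jax_install` is a hypothetical helper, and it relies only on `jax.__version__` and `jax.devices()`. The reported devices depend on the machine and on whether the CUDA build was installed successfully.

```python
import importlib.util


def describe_jax_install():
    """Summarize whether JAX is importable and which devices it sees."""
    if importlib.util.find_spec("jax") is None:
        return {"installed": False, "version": None, "devices": []}

    import jax  # imported lazily, only once we know the package is present

    try:
        # Lists the accelerators (or CPU) visible to the default backend.
        devices = [str(device) for device in jax.devices()]
    except RuntimeError:
        # Backend initialization failed, e.g. a broken CUDA setup.
        devices = []
    return {"installed": True, "version": jax.__version__, "devices": devices}


info = describe_jax_install()
print(info)
```

If `installed` is `False`, or `devices` lists only CPU devices on a GPU machine, the last `pip install` command above is the first thing to re-check.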

Citing i-QN

@article{vincent2024iterated,
  title={Iterated $ Q $-Network: Beyond the One-Step Bellman Operator},
  author={Vincent, Th{\'e}o and Palenicek, Daniel and Belousov, Boris and Peters, Jan and D'Eramo, Carlo},
  journal={arXiv preprint arXiv:2403.02107},
  year={2024}
}