---
license: mit
license_link: https://huggingface.co/TheoVincent/Atari_i-QN/blob/main/LICENSE
tags:
  - reinforcement-learning
  - jax
  - atari
co2_eq_emissions:
  emissions: 3000000
---

# Model parameters trained with `i-DQN` and `i-IQN`
This repository contains the model parameters trained with `i-DQN` on [56 Atari games](#i-DQN_games) and with `i-IQN` on [20 Atari games](#i-IQN_games) 🎮 5 seeds are available for each configuration, which makes a total of **380 available models** 📈

The [evaluate.ipynb](./evaluate.ipynb) notebook contains a minimal example to evaluate the model parameters 🧑‍🏫 It uses JAX 🚀 The hyperparameters used during training are reported in [config.json](./config.json) 🔧
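To give a feel for what the notebook does, here is a minimal sketch of loading a parameter file and acting greedily with JAX. The file names, the pickle layout, and the network `apply_fn` are assumptions made for illustration; the exact interface is the one used in [evaluate.ipynb](./evaluate.ipynb).

```python
# Minimal sketch, assuming parameters are stored as a pickled pytree and that
# the Q-network exposes an `apply_fn(params, observation) -> Q-values` call.
import json
import pickle
from functools import partial

import jax
import jax.numpy as jnp

with open("config.json") as f:          # training hyperparameters
    config = json.load(f)

def load_params(path: str):
    # Hypothetical loader: one pickled pytree of arrays per game and seed.
    with open(path, "rb") as f:
        return pickle.load(f)

@partial(jax.jit, static_argnums=0)     # apply_fn is a function, hence static
def greedy_action(apply_fn, params, observation):
    # One Q-value per action; the greedy policy takes the arg-max.
    q_values = apply_fn(params, observation)
    return jnp.argmax(q_values)
```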

The training code is available here 👉 [💻](https://github.com/theovincent/i-DQN)

PS: The set of [20 Atari games](#i-IQN_games) is included in the set of [56 Atari games](#i-DQN_games).

### Model performances
| <div style="width:300px; font-size: 30px; font-family:'Times New Roman', Times, serif"> **i-DQN** and **i-IQN** are improvements of [DQN](https://www.nature.com/articles/nature14236.pdf) and [IQN](https://arxiv.org/abs/1806.06923). <br> Published at [TMLR](https://arxiv.org/abs/2403.02107) ✨ <br> <div style="font-size: 16px"> <details> <summary id=i-DQN_games>List of games trained with `i-DQN`</summary> *Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, Centipede, ChopperCommand, CrazyClimber, DemonAttack, DoubleDunk, Enduro, FishingDerby, Freeway, Frostbite, Gopher, Gravitar, Hero, IceHockey, Jamesbond, Kangaroo, Krull, KungFuMaster, MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, Pooyan, PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris, SpaceInvaders, StarGunner, Tennis, TimePilot, Tutankham, UpNDown, Venture, VideoPinball, WizardOfWor, YarsRevenge, Zaxxon.* </details> <details> <summary id=i-IQN_games>List of games trained with `i-IQN`</summary> *Alien, Assault, BankHeist, Berzerk, Breakout, Centipede, ChopperCommand, DemonAttack, Enduro, Frostbite, Gopher, Gravitar, IceHockey, Jamesbond, Krull, KungFuMaster, Riverraid, Seaquest, Skiing, StarGunner.* </details> </div> </div> | <img src="performances.png" alt="Model performances" width="600px"/> |
| :-: | :-: |

## User installation
Python 3.10 is recommended. Create a Python virtual environment, activate it, update pip, and install the dependencies:
```bash
python3.10 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install numpy==1.23.5  # pin NumPy 1.x; the pinned JAX release predates NumPy 2.x
pip install -r requirements.txt
pip install --upgrade "jax[cuda12_pip]==0.4.13" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
```
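After installing, a quick way to confirm that the CUDA build is picked up is to list the devices JAX sees. This is only a sanity check added here for convenience, not part of the original instructions:

```python
import jax

# Should list a CUDA device when the GPU wheel was installed correctly;
# otherwise JAX silently falls back to the CPU backend.
print(jax.devices())
```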

## Citing `iterated Q-Network`
```bibtex
@article{vincent2024iterated,
  title={Iterated $ Q $-Network: Beyond the One-Step Bellman Operator},
  author={Vincent, Th{\'e}o and Palenicek, Daniel and Belousov, Boris and Peters, Jan and D'Eramo, Carlo},
  journal={Transactions on Machine Learning Research},
  year={2025}
}
```