Sparse model parameters on 10 Atari games
The sparse model parameters were obtained with EauDeQN and PolyPruneQN, leading to EauDeDQN and PolyPruneDQN in the online scenario, and to EauDeCQL and PolyPruneCQL in the offline scenario. While PolyPruneQN applies a fixed polynomial pruning schedule to reach a final sparsity level of 95%, EauDeQN prunes the network parameters at the agent's learning pace.
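As an illustration, here is a minimal sketch of a polynomial sparsity schedule in the common cubic form of Zhu & Gupta (2018); the exponent and the step counts below are assumptions for illustration, not necessarily the exact PolyPruneQN settings:

```python
def polynomial_sparsity(step: int, total_steps: int,
                        final_sparsity: float = 0.95,
                        power: int = 3) -> float:
    """Fraction of weights pruned at `step`, ramping from 0 to
    `final_sparsity` over `total_steps` (cubic schedule)."""
    fraction = min(step / total_steps, 1.0)
    return final_sparsity * (1.0 - (1.0 - fraction) ** power)

# Toy example: halfway through the schedule, ~83% of weights are pruned.
print(polynomial_sparsity(step=20_000_000, total_steps=40_000_000))  # 0.83125
```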
We also release the model parameters for the dense approach using DQN and CQL. Online training lasted 40M frames, and offline training lasted 50 $\times$ 62 500 = 3 125 000 gradient steps. We used the CNN architecture, where the number of neurons of the first linear layer is reported as "Feature Size":
| Training type | Agent | 32 (Small) | 512 (Medium) | 2048 (Large) |
|---|---|---|---|---|
| Online | EauDeDQN | ✅ | ✅ | ✅ |
| Online | PolyPruneDQN | ✅ | ✅ | ✅ |
| Online | DQN (dense) | ✅ | ✅ | ✅ |
| Offline | EauDeCQL | ✅ | ✅ | |
| Offline | PolyPruneCQL | ✅ | ✅ | |
| Offline | CQL (dense) | ✅ | ✅ | |
5 seeds are available for each configuration and each of the 10 games, which makes a total of 15 $\times$ 10 $\times$ 5 = 750 available models.
The evaluate.ipynb notebook contains a minimal example to evaluate the model parameters. It uses JAX. The hyperparameters used during training are reported in online_config.json and offline_config.json.
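For orientation, here is a minimal, self-contained sketch of greedy action selection in JAX; the network and parameter shapes below are placeholders (a single linear layer with toy dimensions, not the paper's CNN), and evaluate.ipynb remains the authoritative example:

```python
import jax
import jax.numpy as jnp

def apply_fn(params, observation):
    # Stand-in Q-function: one linear layer instead of the paper's CNN;
    # replace with the apply function from the repository.
    return observation @ params["w"] + params["b"]

@jax.jit
def greedy_action(params, observation):
    # Act greedily with respect to the Q-values. Pruned weights being
    # stored as exact zeros (an assumption about the released
    # checkpoints) leaves the forward pass unchanged.
    return jnp.argmax(apply_fn(params, observation))

# Toy parameters with dimensions that are NOT the Atari ones.
key = jax.random.PRNGKey(0)
obs_dim, n_actions = 8, 6
params = {"w": jax.random.normal(key, (obs_dim, n_actions)),
          "b": jnp.zeros(n_actions)}
print(greedy_action(params, jnp.ones(obs_dim)))
```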
The training code will be available soon.
Model sparsity & performance
EauDeDQN and EauDeCQL achieve high sparsity while keeping performance high (published at RLDM).

List of Atari games: BeamRider, MsPacman, Qbert, Pong, Enduro, SpaceInvaders, Assault, CrazyClimber, Boxing, VideoPinball.
The episodic returns and lengths are available in the evaluations folder.
User installation
Python 3.11 is recommended. Create a Python virtual environment, activate it, update pip, and install the dependencies:
```bash
python3 -m venv env
source env/bin/activate
pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
```
Citing Eau De Q-Network
```bibtex
@inproceedings{Vincent_CRLDM_2025,
  title={Eau De $Q$-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning},
  author={Vincent, Th{\'e}o and Faust, Tim and Tripathi, Yogesh and Peters, Jan and D'Eramo, Carlo},
  booktitle={Conference on Reinforcement Learning and Decision Making (RLDM)},
  year={2025}
}
```