Sparse model parameters on 10 Atari games

The sparse model parameters were obtained with EauDeQN and PolyPruneQN, leading to EauDeDQN and PolyPruneDQN in the online scenario, and to EauDeCQL and PolyPruneCQL in the offline scenario 🎮 While PolyPruneQN applies a fixed polynomial pruning schedule to reach a final sparsity level of 95%, EauDeQN prunes the network parameters at the agent's learning pace 🪡
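For intuition, here is a minimal sketch of the kind of polynomial pruning schedule mentioned above: sparsity starts at 0% and is annealed toward the 95% target over training. The cubic exponent and the assumption of a zero initial sparsity are illustrative choices on our part, not the exact values used by PolyPruneQN.

```python
def polynomial_sparsity(step: int, total_steps: int,
                        final_sparsity: float = 0.95, power: int = 3) -> float:
    """Sparsity target at a given training step under a polynomial schedule.

    Anneals from 0% sparsity at step 0 to `final_sparsity` at `total_steps`.
    The cubic exponent (`power=3`) is a common choice, assumed here for
    illustration; PolyPruneQN's exact schedule may differ.
    """
    progress = min(step / total_steps, 1.0)
    return final_sparsity * (1.0 - (1.0 - progress) ** power)


# Example: sparsity ramps up quickly early on, then flattens near the target.
print(polynomial_sparsity(0, 100))    # 0.0
print(polynomial_sparsity(50, 100))   # ~0.83
print(polynomial_sparsity(100, 100))  # 0.95
```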

We also release the model parameters for the dense baselines, DQN and CQL 🏋️ The online agents were trained for 40M frames, and the offline agents for 50 $\times$ 62,500 gradient steps ⏱️ We used a CNN architecture whose first linear layer width is reported as "Feature Size" in the table below (a minimal sketch of this architecture follows the table):

| Training type | Feature Size: 32 (Small) | Feature Size: 512 (Medium) | Feature Size: 2048 (Large) |
| --- | --- | --- | --- |
| Online EauDeDQN | ✅ | ✅ | ✅ |
| Online PolyPruneDQN | ✅ | ✅ | ✅ |
| Online DQN (dense) | ✅ | ✅ | ✅ |
| Offline EauDeCQL | ✅ | ✅ | |
| Offline PolyPruneCQL | ✅ | ✅ | |
| Offline CQL (dense) | ✅ | ✅ | |
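For reference, the sketch below shows a DQN-style CNN in Flax, where the width of the first linear layer after the convolutions corresponds to the "Feature Size" column above. The convolutional layer shapes are the classic Atari DQN ones and are an assumption on our part; consult online_config.json and offline_config.json for the exact architecture.

```python
import flax.linen as nn
import jax.numpy as jnp


class AtariCNN(nn.Module):
    """DQN-style convolutional torso followed by two linear layers.

    `feature_size` is the width of the first linear layer, reported as
    "Feature Size" in the table above. The convolutional shapes below are
    the classic Atari DQN ones, assumed here for illustration.
    """
    feature_size: int  # 32, 512, or 2048
    num_actions: int

    @nn.compact
    def __call__(self, x: jnp.ndarray) -> jnp.ndarray:
        x = nn.relu(nn.Conv(32, kernel_size=(8, 8), strides=(4, 4))(x))
        x = nn.relu(nn.Conv(64, kernel_size=(4, 4), strides=(2, 2))(x))
        x = nn.relu(nn.Conv(64, kernel_size=(3, 3), strides=(1, 1))(x))
        x = x.reshape((x.shape[0], -1))          # flatten to (batch, features)
        x = nn.relu(nn.Dense(self.feature_size)(x))  # the "Feature Size" layer
        return nn.Dense(self.num_actions)(x)     # one Q-value per action
```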

Five seeds are available for each configuration, which makes a total of 750 models (15 agent/feature-size configurations $\times$ 10 games $\times$ 5 seeds) 📈

The evaluate.ipynb notebook contains a minimal example showing how to evaluate the model parameters 🧑‍🏫 It uses JAX 🚀 The hyperparameters used during training are reported in online_config.json and offline_config.json 🔧
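For readers who want a quick look outside the notebook, the sketch below shows the general shape of a greedy evaluation step in JAX, reusing the `AtariCNN` sketch from above. The parameter file name, the pickle format, and the network configuration are assumptions; evaluate.ipynb is the authoritative reference.

```python
import pickle

import jax
import jax.numpy as jnp

# Hypothetical file name and format; see evaluate.ipynb for the actual loading code.
with open("model_params.pkl", "rb") as f:
    params = pickle.load(f)

network = AtariCNN(feature_size=512, num_actions=18)  # sketch defined above


@jax.jit
def greedy_action(params, observation: jnp.ndarray) -> jnp.ndarray:
    """Greedy policy: pick the action with the highest predicted Q-value."""
    q_values = network.apply(params, observation[None])  # add batch dimension
    return jnp.argmax(q_values, axis=-1)[0]
```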

The training code will be released soon ⏳

Model sparsity & performance

EauDeDQN and EauDeCQL achieve high sparsity while keeping performance high.
Published at RLDM ✨
List of Atari games: BeamRider, MsPacman, Qbert, Pong, Enduro, SpaceInvaders, Assault, CrazyClimber, Boxing, VideoPinball.

The episodic returns and lengths are available in the evaluations folder 🔬
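A small sketch of how one might aggregate those evaluations is shown below. The file layout and the `episode_returns` key are purely hypothetical; check the evaluations folder itself for the actual file names and format.

```python
import json
from pathlib import Path

import numpy as np

# Hypothetical layout: one JSON file per run inside `evaluations/`.
for path in sorted(Path("evaluations").glob("*.json")):
    with open(path) as f:
        data = json.load(f)
    returns = np.asarray(data["episode_returns"])  # assumed key name
    print(f"{path.stem}: mean return {returns.mean():.1f} "
          f"over {len(returns)} episodes")
```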

User installation

Python 3.11 is recommended. Create a Python virtual environment, activate it, update pip and install the package and its dependencies in editable mode:

```bash
python3 -m venv env
source env/bin/activate
pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
```

Citing Eau De Q-Network

```bibtex
@inproceedings{Vincent_CRLDM_2025,
    title={Eau De $Q$-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning},
    author={Vincent, Th{\'e}o and Faust, Tim and Tripathi, Yogesh and Peters, Jan and D'Eramo, Carlo},
    booktitle={Conference on Reinforcement Learning and Decision Making (RLDM)},
    year={2025}
}
```