Update README.md
README.md
CHANGED
|
@@ -11,19 +11,19 @@ model-index:
|
|
| 11 |
results:
|
| 12 |
- metrics:
|
| 13 |
- type: FAS (J=1)
|
| 14 |
-
value: 0.
|
| 15 |
name: FAS
|
| 16 |
- type: FAS (J=2)
|
| 17 |
-
value: 0.
|
| 18 |
name: FAS
|
| 19 |
- type: FAS (J=4)
|
| 20 |
-
value: 0.
|
| 21 |
name: FAS
|
| 22 |
- type: FAS (J=8)
|
| 23 |
-
value: 0.
|
| 24 |
name: FAS
|
| 25 |
- type: FAS (J=16)
|
| 26 |
-
value: 0.
|
| 27 |
name: FAS
|
| 28 |
task:
|
| 29 |
type: OpenAI Gym
|
|
@@ -36,57 +36,17 @@ model-index:
|
|
| 36 |
---
|
| 37 |
# Soft-Actor-Critic: Walker2d-v2
|
| 38 |
|
| 39 |
-
These are 25 trained models over **seeds (0-4)**
|
| 40 |
|
| 41 |
## Model Sources
|
| 42 |
|
| 43 |
**Repository:** [https://github.com/dee0512/Sequence-Reinforcement-Learning](https://github.com/dee0512/Sequence-Reinforcement-Learning)
|
| 44 |
**Paper (ICLR):** [https://openreview.net/forum?id=w3iM4WLuvy](https://openreview.net/forum?id=w3iM4WLuvy)
|
| 45 |
-
**Arxiv:** [arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)
|
| 46 |
|
| 47 |
-
|
| 48 |
-
Using the repository:
|
| 49 |
-
|
| 50 |
-
```
|
| 51 |
-
python .\train_sac.py --env_name <env_name> --seed <seed> --j <j>
|
| 52 |
-
```
|
| 53 |
-
|
| 54 |
-
# Evaluation:
|
| 55 |
|
| 56 |
-
Download the models folder and place it in the same directory as the cloned repository.
|
| 57 |
Using the repository:
|
| 58 |
|
| 59 |
-
```
|
| 60 |
-
python
|
| 61 |
-
```
|
| 62 |
-
|
| 63 |
-
## Metrics:
|
| 64 |
-
|
| 65 |
-
**FAS:** Frequency Averaged Score
|
| 66 |
-
**j:** Action repetition parameter
|
| 67 |
-
|
| 68 |
-
|
| 69 |
-
# Citation
|
| 70 |
-
|
| 71 |
-
The paper can be cited with the following bibtex entry:
|
| 72 |
-
|
| 73 |
-
## BibTeX:
|
| 74 |
-
|
| 75 |
-
```
|
| 76 |
-
@inproceedings{DBLP:conf/iclr/PatelS25,
|
| 77 |
-
author = {Devdhar Patel and
|
| 78 |
-
Hava T. Siegelmann},
|
| 79 |
-
title = {Overcoming Slow Decision Frequencies in Continuous Control: Model-Based
|
| 80 |
-
Sequence Reinforcement Learning for Model-Free Control},
|
| 81 |
-
booktitle = {The Thirteenth International Conference on Learning Representations,
|
| 82 |
-
{ICLR} 2025, Singapore, April 24-28, 2025},
|
| 83 |
-
publisher = {OpenReview.net},
|
| 84 |
-
year = {2025},
|
| 85 |
-
url = {https://openreview.net/forum?id=w3iM4WLuvy}
|
| 86 |
-
}
|
| 87 |
-
```
|
| 88 |
-
|
| 89 |
-
## APA:
|
| 90 |
-
```
|
| 91 |
-
Patel, D., & Siegelmann, H. T. Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. In The Thirteenth International Conference on Learning Representations.
|
| 92 |
-
```
|
|
|
|
| 11 |
results:
|
| 12 |
- metrics:
|
| 13 |
- type: FAS (J=1)
|
| 14 |
+
value: 0.070768 ± 0.011055
|
| 15 |
name: FAS
|
| 16 |
- type: FAS (J=2)
|
| 17 |
+
value: 0.083818 ± 0.025049
|
| 18 |
name: FAS
|
| 19 |
- type: FAS (J=4)
|
| 20 |
+
value: 0.137035 ± 0.042001
|
| 21 |
name: FAS
|
| 22 |
- type: FAS (J=8)
|
| 23 |
+
value: 0.232737 ± 0.065282
|
| 24 |
name: FAS
|
| 25 |
- type: FAS (J=16)
|
| 26 |
+
value: 0.150935 ± 0.043573
|
| 27 |
name: FAS
|
| 28 |
task:
|
| 29 |
type: OpenAI Gym
|
|
|
|
| 36 |
---
|
| 37 |
# Soft-Actor-Critic: Walker2d-v2
|
| 38 |
|
| 39 |
+
These are 25 trained models over **seeds (0-4)** and **J = 1, 2, 4, 8, 16** of a **Soft Actor Critic (SAC)** agent playing **Walker2d-v2** from **[Sequence Reinforcement Learning (SRL)](https://github.com/dee0512/Sequence-Reinforcement-Learning)**.
|
| 40 |
|
| 41 |
## Model Sources
|
| 42 |
|
| 43 |
**Repository:** [https://github.com/dee0512/Sequence-Reinforcement-Learning](https://github.com/dee0512/Sequence-Reinforcement-Learning)
|
| 44 |
**Paper (ICLR):** [https://openreview.net/forum?id=w3iM4WLuvy](https://openreview.net/forum?id=w3iM4WLuvy)
|
| 45 |
+
**Arxiv:** [https://arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)
|
| 46 |
|
| 47 |
+
## Training Details
|
| 48 |
|
| 49 |
Using the repository:
|
| 50 |
|
| 51 |
+
```bash
|
| 52 |
+
python ./train_sac.py --env_name Walker2d-v2 --seed <seed> --j <j>
|