nielsr (HF staff) committed
Commit ec20bfe · verified · 1 Parent(s): 96c02bc

Add library_name, fix pipeline_tag


This PR improves the model card by ensuring:

- there's a proper `pipeline_tag`, so the model can be found at https://huggingface.co/models?pipeline_tag=reinforcement-learning
- the proper `library_name` is set, enabling the "how to use" button in the top right.
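This PR makes the change by editing README.md directly (see the diff below). As an aside, the same front-matter keys can also be set programmatically with `huggingface_hub`; the sketch below is not part of this PR and assumes the target repo is LTL07/PSEC (the Hub repo linked in the updated README) and that a write token is configured.

```python
# Minimal sketch (not part of this PR): set the same model-card metadata
# programmatically. Assumes the target repo is LTL07/PSEC and that you have
# write access with a configured token.
from huggingface_hub import metadata_update

metadata_update(
    repo_id="LTL07/PSEC",
    metadata={
        "library_name": "diffusers",
        "pipeline_tag": "reinforcement-learning",
    },
    overwrite=True,  # replace any existing values for these keys
)
```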

Files changed (1)

1. README.md (+11 −8)
README.md CHANGED

@@ -1,6 +1,9 @@
 ---
 license: mit
+library_name: diffusers
+pipeline_tag: reinforcement-learning
 ---
+
 <div align="center">
 <div style="margin-bottom: 30px"> <!-- reduce bottom spacing -->
 <div style="display: flex; flex-direction: column; align-items: center; gap: 8px"> <!-- new vertical layout container -->
@@ -10,15 +13,15 @@ license: mit
 </div>
 <h2 style="font-size: 32px; margin: 20px 0;">Skill Expansion and Composition in Parameter Space</h2>
 <h4 style="color: #666; margin-bottom: 25px;">International Conference on Learning Representation (ICLR), 2025</h4>
-<p align="center" style="margin: 20px 0;">
+<p align="center" style="margin: 30px 0;">
 <a href="https://arxiv.org/abs/2502.05932">
 <img src="https://img.shields.io/badge/arXiv-2502.05932-b31b1b.svg">
 </a>
-<!-- &nbsp;&nbsp; -->
+&nbsp;&nbsp;
 <a href="https://ltlhuuu.github.io/PSEC/">
 <img src="https://img.shields.io/badge/🌐_Project_Page-PSEC-blue.svg">
 </a>
-<!-- &nbsp;&nbsp; -->
+&nbsp;&nbsp;
 <a href="https://arxiv.org/pdf/2502.05932.pdf">
 <img src="https://img.shields.io/badge/📑_Paper-PSEC-green.svg">
 </a>
@@ -31,10 +34,10 @@ license: mit
 🔥 Official Implementation
 </p>
 <p style="font-size: 18px; max-width: 800px; margin: 0 auto;">
-<b>PSEC</b> is a novel framework designed to:
+<img src="assets/icon.svg" width="20"> <b>PSEC</b> is a novel framework designed to:
 </p>
 </div>
-<div align="center">
+<div align="left">
 <p style="font-size: 15px; font-weight: 600; margin-bottom: 20px;">
 🚀 <b>Facilitate</b> efficient and flexible skill expansion and composition <br>
 🔄 <b>Iteratively evolve</b> the agents' capabilities<br>
@@ -99,18 +102,18 @@ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin
 ```
 ## Run experiments
 ### Pretrain
-Pretrain the model with the following command. Meanwhile there are pre-trained models, you can download them from [here](https://drive.google.com/drive/folders/1lpcShmYoKVt4YMH66JBiA0MhYEV9aEYy?usp=sharing).
+Pretrain the model with the following command. Meanwhile there are pre-trained models, you can download them from [here](https://huggingface.co/LTL07/PSEC).
 ```python
 export XLA_PYTHON_CLIENT_PREALLOCATE=False
 CUDA_VISIBLE_DEVICES=0 python launcher/examples/train_pretrain.py --variant 0 --seed 0
 ```
 ### LoRA finetune
-Train the skill policies with LoRA to achieve skill expansion. Meanwhile there are pre-trained models, you can download them from [here](https://drive.google.com/drive/folders/1lpcShmYoKVt4YMH66JBiA0MhYEV9aEYy?usp=sharing).
+Train the skill policies with LoRA to achieve skill expansion. Meanwhile there are pre-trained models, you can download them from [here](https://huggingface.co/LTL07/PSEC).
 ```python
 CUDA_VISIBLE_DEVICES=0 python launcher/examples/train_lora_finetune.py --com_method 0 --model_cls 'LoRALearner' --variant 0 --seed 0
 ```
 ### Context-aware Composition
-Train the context-aware modular to adaptively leverage different skill knowledge to solve the tasks. You can download the pretrained model and datasets from [here](https://drive.google.com/drive/folders/1lpcShmYoKVt4YMH66JBiA0MhYEV9aEYy?usp=sharing). Then, run the following command,
+Train the context-aware modular to adaptively leverage different skill knowledge to solve the tasks. You can download the pretrained model and datasets from [here](https://huggingface.co/LTL07/PSEC). Then, run the following command,
 ```python
 CUDA_VISIBLE_DEVICES=0 python launcher/examples/train_lora_finetune.py --com_method 0 --model_cls 'LoRASLearner' --variant 0 --seed 0
 ```
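
The updated README sections above point users to https://huggingface.co/LTL07/PSEC for the pre-trained models and datasets. A minimal sketch of fetching them with `huggingface_hub` (the `local_dir` path is an arbitrary example; the exact file layout inside the repo is not specified here):

```python
# Minimal sketch: download the pre-trained PSEC checkpoints referenced in the
# README from the Hub. The local_dir path is a hypothetical destination.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="LTL07/PSEC",
    local_dir="./pretrained/PSEC",
)
print(f"Checkpoints downloaded to {local_path}")
```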