---
license: mit
---

<img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/J_4FHXmtM6TuRnN3aL06y.png" width="38" height="38">

This is the Llama2 LoRA weight that was fine-tuned on **MUFFIN** (**Mu**lti-**F**aceted **In**structions).

We fine-tuned [Llama2-13B](https://huggingface.co/meta-llama/Llama-2-13b-hf) on the [MUFFIN dataset](https://renzelou.github.io/Muffin/) with LoRA (low-rank adaptation).

We release LoRA weights for both the Llama2-7B and Llama2-13B models:

| Model | LoRA Target Modules |
|---|---|
| [MUFFIN-Llama2-7B](https://huggingface.co/Reza8848/MUFFIN-Llama2-lora-7B) | `Q, K, V, O` |
| [MUFFIN-Llama2-13B](https://huggingface.co/Reza8848/MUFFIN-Llama2-lora-13B) | `Q, K, V, O` |

## Model Usage

### 1. Inference code

We use [Alpaca-lora](https://github.com/tloen/alpaca-lora) as our fine-tuning code.

So, when using the released weights for inference, we recommend the [generation code](https://github.com/tloen/alpaca-lora/blob/main/generate.py) of Alpaca-lora to reproduce our performance.

Please follow the Alpaca-lora documentation to set up the **correct Python environment** first.

> Our released LoRA weights are in **`.safetensors`** format rather than the common **`.bin`** torch model files.
> Incompatible `transformers` and `torch` versions may result in [PEFT compatibility errors](https://github.com/huggingface/transformers/issues/27397) when loading the released LoRA weights.
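
For a quick sanity check outside of `generate.py`, loading the adapter with PEFT might look like the sketch below (a minimal sketch, assuming recent `transformers` and `peft`; the 13B weights are shown, and `generate.py` remains the recommended path for reproducing our results):

```python
# Minimal loading sketch (not the official script): attach the released
# LoRA adapter to the base Llama2 model with PEFT.
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base_model = "meta-llama/Llama-2-13b-hf"
lora_weights = "Reza8848/MUFFIN-Llama2-lora-13B"

tokenizer = LlamaTokenizer.from_pretrained(base_model)
model = LlamaForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)
# PEFT resolves the .safetensors adapter file automatically.
model = PeftModel.from_pretrained(model, lora_weights, torch_dtype=torch.float16)
model.eval()
```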

### 2. Prompt template

Please use the following prompt template (save the following dict as a JSON file under the ['templates' folder](https://github.com/tloen/alpaca-lora/tree/main/templates)):

```json
{
    "description": "Template used by muffin.",
    "prompt_input": "### Input:\n{input}\n\n### Instruction:\n{instruction}\n\n### Response:\n",
    "prompt_no_input": "### Input:\nNone\n\n### Instruction:\n{instruction}\n\n### Response:\n",
    "response_split": "### Response:"
}
```
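
Applying the template is plain string formatting; the sketch below mirrors what Alpaca-lora's `Prompter` class does with these fields (the file name `muffin.json` is our choice here, and the stand-in model output is hypothetical):

```python
# Illustrative template usage: build a prompt, then split the model output
# on the response_split marker to recover the answer.
import json

with open("templates/muffin.json") as f:  # name the file to match your template name
    template = json.load(f)

prompt = template["prompt_input"].format(
    input="The quick brown fox jumps over the lazy dog.",
    instruction="Count the words in the given sentence.",
)

generated = prompt + "9"  # stand-in for actual model output
answer = generated.split(template["response_split"])[-1].strip()
print(answer)  # -> 9
```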

### 3. Generation hyper-parameters

We use the default generation hyper-parameters as defined in [this line](https://github.com/tloen/alpaca-lora/blob/main/generate.py#L90).

Besides, be aware of the following hyper-parameters:
- `eval_batch_size == 1`. **Batched inference (`eval_batch_size > 1`) yields inconsistent results.**
- `max_input_len == 1024`. This was the maximum input length during training, but any length is fine at inference since the evaluation batch size is 1.
- `num_beams == 1`. We used a beam size of 1 in our experiments, but a larger beam size may yield better responses.
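
Putting these settings together, generation for a single example might look like the following (a sketch continuing the loading code above: `model`, `tokenizer`, and `prompt` come from the earlier snippets, and the sampling values are the defaults from Alpaca-lora's `generate.py`):

```python
# Generate for a single prompt (eval_batch_size == 1) with beam size 1.
import torch
from transformers import GenerationConfig

generation_config = GenerationConfig(
    temperature=0.1,  # Alpaca-lora defaults; see the linked line above
    top_p=0.75,
    top_k=40,
    num_beams=1,      # beam size 1, as in our experiments
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        generation_config=generation_config,
        max_new_tokens=256,  # illustrative value, not from the paper
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```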

## Zero-Shot Evaluation Performance

We use the [metric calculation scripts](https://github.com/yizhongw/Tk-Instruct/blob/main/src/compute_metrics.py) of [Tk-Instruct](https://github.com/yizhongw/Tk-Instruct/tree/main) (i.e., `ROUGE-L` and `Exact-Match`).
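
For intuition, the two metrics behave roughly as in the sketch below (an illustration only, with a simplified normalization; use the Tk-Instruct script above to reproduce the reported numbers; requires the `rouge-score` package):

```python
# Rough illustration of Exact-Match and ROUGE-L scoring on one example.
from rouge_score import rouge_scorer

def exact_match(prediction: str, reference: str) -> float:
    """Whitespace- and case-normalized exact match (simplified)."""
    return float(prediction.strip().lower() == reference.strip().lower())

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

prediction, reference = "nine words in total", "nine"
print(exact_match(prediction, reference))                      # 0.0
print(scorer.score(reference, prediction)["rougeL"].fmeasure)  # partial credit
```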

<div style="text-align:center"><img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/IjeMYWLMRO_qGOOiXxemP.png" alt="performances.png" width="600"/></div>

## 🥳 Citation

Please kindly cite our paper if you use any resources in this repository:

```bibtex
@inproceedings{Lou2023MUFFIN,
    title={{MUFFIN}: Curating Multi-Faceted Instructions for Improving Instruction Following},
    author={Renze Lou and Kai Zhang and Jian Xie and Yuxuan Sun and Janice Ahn and Hanzi Xu and Yu Su and Wenpeng Yin},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=1vrS1zwekw}
}
```