This is a merge of both VerA and VerB of Etheria-55b (their numbers were surprisingly good). I then created a sacrificial 55B out of the most performant yi-34b-200k model and performed a dare_ties merge to equalize the model into its current state.

### Recommended Settings and Prompt Format

I've tested it up to 32k context using exl2 with these settings:

```
"temp": 0.7,
"temperature_last": true,
"top_p": 1,
"top_k": 0,
"top_a": 0,
"tfs": 1,
"epsilon_cutoff": 0,
"eta_cutoff": 0,
"typical_p": 1,
"min_p": 0.1,
"rep_pen": 1.1,
"rep_pen_range": 8192,
"no_repeat_ngram_size": 0,
"penalty_alpha": 0,
"num_beams": 1,
"length_penalty": 1,
"min_length": 0,
"encoder_rep_pen": 1,
"freq_pen": 0,
"presence_pen": 0,
"do_sample": true,
"early_stopping": false,
"add_bos_token": false,
"truncation_length": 2048,
"ban_eos_token": true,
"skip_special_tokens": true,
"streaming": true,
"mirostat_mode": 0,
"mirostat_tau": 5,
"mirostat_eta": 0.1,
```
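
If you load the model directly with exllamav2 rather than through a frontend, the sketch below shows one way the core sampler values above might map onto its sampler settings. It assumes the exllamav2 Python generator API; attribute names can differ between versions, so treat it as illustrative rather than a drop-in preset.

```
# Sketch: mapping this preset's main sampler values onto exllamav2.
# Assumes the exllamav2 generator API; names may vary by version.
from exllamav2.generator import ExLlamaV2Sampler

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7               # "temp"
settings.top_k = 0                       # 0 disables top-k
settings.top_p = 1.0                     # 1.0 disables top-p
settings.min_p = 0.1                     # the main active filter in this preset
settings.token_repetition_penalty = 1.1  # "rep_pen"
settings.token_repetition_range = 8192   # "rep_pen_range"
```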

Prompt formats that work well:

```
ChatML & Alpaca
```
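
For quick reference, these are the standard generic templates for the two formats (not prompts taken from this card). ChatML:

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
```

and Alpaca:

```
### Instruction:
{instruction}

### Response:
```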
### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with Merged-Etheria-55b as the base.
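
The card doesn't include the mergekit config itself, so the sketch below is only a guess at its shape: a dare_ties merge over Merged-Etheria-55b, with the second model name, weight, and density as placeholders rather than the author's actual values.

```
base_model: Merged-Etheria-55b
merge_method: dare_ties
models:
  - model: Merged-Etheria-55b
  - model: sacrificial-yi-55b   # placeholder name for the sacrificial 55B
    parameters:
      weight: 0.5               # placeholder value
      density: 0.5              # placeholder value
dtype: bfloat16
```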