Steelskull committed (verified)
Commit 014ab89 · 1 Parent(s): ebcddf3

Update README.md

Files changed (1): README.md (+42, -0)

README.md CHANGED
@@ -19,6 +19,48 @@ as it is unknown (at this time) what the merge has done to the context length.
This is a merge of both VerA and VerB of Etheria-55b (their numbers were surprisingly good). I then created a sacrificial 55B out of the most performant yi-34b-200k model
and performed a Dare_ties merge to equalize the model into its current state.
+ ### Recommended Settings and Prompt Format:
+
+ I've tested it up to 32k context using exl2 with these settings:
+
+ ```
+ "temp": 0.7,
+ "temperature_last": true,
+ "top_p": 1,
+ "top_k": 0,
+ "top_a": 0,
+ "tfs": 1,
+ "epsilon_cutoff": 0,
+ "eta_cutoff": 0,
+ "typical_p": 1,
+ "min_p": 0.1,
+ "rep_pen": 1.1,
+ "rep_pen_range": 8192,
+ "no_repeat_ngram_size": 0,
+ "penalty_alpha": 0,
+ "num_beams": 1,
+ "length_penalty": 1,
+ "min_length": 0,
+ "encoder_rep_pen": 1,
+ "freq_pen": 0,
+ "presence_pen": 0,
+ "do_sample": true,
+ "early_stopping": false,
+ "add_bos_token": false,
+ "truncation_length": 2048,
+ "ban_eos_token": true,
+ "skip_special_tokens": true,
+ "streaming": true,
+ "mirostat_mode": 0,
+ "mirostat_tau": 5,
+ "mirostat_eta": 0.1,
+ ```
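If you are running the exl2 quant directly from Python rather than through a front end, the core sampler values above can be applied through exllamav2's sampler settings. This is only a rough sketch assuming the exllamav2 Python API; attribute names can shift between versions, and the model path is a placeholder.

```
# Rough sketch: load an exl2 quant and apply the key sampler values above.
# Assumes the exllamav2 Python API; the model path is a placeholder and
# attribute names should be checked against the installed version.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/Etheria-55b-exl2"  # placeholder path
config.prepare()
config.max_seq_len = 32768                      # the 32k context tested above

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7                      # "temp"
settings.top_p = 1.0
settings.top_k = 0
settings.top_a = 0.0
settings.min_p = 0.1
settings.token_repetition_penalty = 1.1         # "rep_pen"
settings.token_repetition_range = 8192          # "rep_pen_range"

output = generator.generate_simple("Write a short scene:", settings, 200)
print(output)
```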
+
+ Prompt formats that work well:
+ ```
+ ChatML & Alpaca
+ ```
+
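For reference, the two prompt formats named above typically look like the sketch below; the system line and example turns are illustrative placeholders, not anything baked into this model.

```
# ChatML
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant

# Alpaca
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Hello!

### Response:
```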
### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with Merged-Etheria-55b as the base.
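A dare_ties merge like the one described here is typically produced with mergekit; the config below is only a minimal sketch of that method. The base model name comes from this README, while the donor entry and the density/weight values are hypothetical placeholders rather than the actual recipe used.

```
# Minimal mergekit sketch of a dare_ties merge (assumed tooling).
# base_model is taken from this README; the donor model name and the
# density/weight values are hypothetical placeholders.
merge_method: dare_ties
base_model: Merged-Etheria-55b
models:
  - model: Merged-Etheria-55b
  - model: sacrificial-yi-55b   # placeholder for the 55B built from yi-34b-200k
    parameters:
      density: 0.5
      weight: 0.5
dtype: bfloat16
```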