ProdeusUnity committed: Update README.md

README.md CHANGED
@@ -11,41 +11,36 @@ tags:
 
 Listen to the song on youtube: https://www.youtube.com/watch?v=npyiiInMA0w
 
-
+This is my second attempt at a model merge. This time, the following models were used:
 
-
+- mistralai/Mistral-Nemo-Base-2407
+- Sao10K/MN-12B-Lyra-v4
+- nothingiisreal/MN-12B-Starcannon-v2
+- Gryphe/Pantheon-RP-1.5-12b-Nemo
 
-
-Sao10K/MN-12B-Lyra-v4
-nothingiisreal/MN-12B-Starcannon-v2
-Gryphe/Pantheon-RP-1.5-12b-Nemo
+The license for this model is Apache 2.0 (inherited from the base model, Mistral Nemo Base 2407).
 
-
+Intended use case: Roleplay
 
-
-TO CLEAR SOME CONFUSION: Please use ChatML
-~~I hope this was worth the time I spent to create this merge, lol~~
-
-Gated access for now, gated access will be disabled when testing is done, and thanks to all who have interest.
+Instruction format: ChatML
 
 Thank you to AuriAetherwiing for helping me merge the models.
 
-
-
+# Data?
 
-This is a merge of
+This is a hard question to answer. I didn't add any data to the model itself; rather, it's a merge of other models, so the data used to train those models applies to this one too, though the merged result won't behave the same as any of them.
 
 ## Merge Details
 ### Merge Method
 
-This model was merged using the della_linear merge method using
+This model was merged using the della_linear merge method, with mistralai/Mistral-Nemo-Base-2407 as the base.
 
 ### Models Merged
 
 The following models were included in the merge:
-*
-*
-*
+* Sao10K/MN-12B-Lyra-v4
+* Gryphe/Pantheon-RP-1.5-12b-Nemo
+* nothingiisreal/MN-12B-Starcannon-v2
 
 ### Configuration
 
@@ -73,3 +68,10 @@ parameters:
 merge_method: della_linear
 dtype: bfloat16
 ```
+
+## Notes
+
+Della_Linear: See https://arxiv.org/abs/2406.11617 (DELLA) and https://arxiv.org/abs/2212.04089 (task arithmetic); a full explanation is too long for this card.
+BFloat16: Brain Floating Point 16, a 16-bit floating-point format with the same exponent range as float32; it halves memory use and speeds up inference on modern NVIDIA GPUs.
+Density: Fraction of weights in differences from the base model to retain.
+Epsilon: Maximum change in drop probability based on magnitude. Drop probabilities assigned will range from density − epsilon to density + epsilon.
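The updated card specifies ChatML as the instruction format, so here is a minimal sketch of what a ChatML-formatted prompt looks like. The system prompt text is illustrative only; the card does not prescribe one.

```python
# Minimal ChatML prompt builder (sketch). ChatML wraps each turn in
# <|im_start|>{role} ... <|im_end|> markers. The system text below is
# illustrative; the model card does not prescribe one.
def chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"  # the model's reply is generated from here
    )

print(chatml_prompt("You are a creative roleplay partner.", "Describe the tavern we just entered."))
```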
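The Configuration block in the diff is truncated, and della_linear is a mergekit merge method (this README follows mergekit's card template), so for orientation here is the general shape of a mergekit della_linear config. Every numeric value below is a placeholder, not a value this model actually used.

```python
# Sketch: general shape of a mergekit della_linear config, run via the
# mergekit-yaml CLI (requires: pip install mergekit). All numbers are
# PLACEHOLDERS; the card's real values are truncated out of the diff above.
import pathlib
import subprocess

config = """\
base_model: mistralai/Mistral-Nemo-Base-2407
merge_method: della_linear
dtype: bfloat16
models:
  - model: Sao10K/MN-12B-Lyra-v4
    parameters: {weight: 0.33, density: 0.5}
  - model: Gryphe/Pantheon-RP-1.5-12b-Nemo
    parameters: {weight: 0.33, density: 0.5}
  - model: nothingiisreal/MN-12B-Starcannon-v2
    parameters: {weight: 0.33, density: 0.5}
parameters:
  epsilon: 0.05
"""
pathlib.Path("merge.yml").write_text(config)
# Writes the merged model into ./merged-model
subprocess.run(["mergekit-yaml", "merge.yml", "./merged-model"], check=True)
```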
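Since the config sets dtype: bfloat16, the merged weights are stored in bfloat16. A minimal sketch of loading them that way with transformers follows; the repo id is a placeholder, since the card's own id isn't shown in the diff.

```python
# Sketch: load a merged model in bfloat16 with Hugging Face transformers.
# The repo id is a placeholder -- substitute the actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-name/your-merged-model"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # keep weights in bf16 instead of upcasting to fp32
    device_map="auto",           # requires the accelerate package
)
```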
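To make the Density/Epsilon notes concrete: in DELLA-style merging, each element of a delta (task vector) gets a drop probability centered near 1 − density, shifted by up to ±epsilon according to its magnitude rank (larger deltas are dropped less often), and surviving elements are rescaled so the expected delta is unchanged. Below is a rough numpy sketch of that idea, based on my reading of arXiv:2406.11617; it is not mergekit's actual implementation.

```python
# Rough sketch of DELLA-style magnitude-aware dropping (MAGPRUNE), per my
# reading of arXiv:2406.11617 and the mergekit docs -- NOT mergekit's code.
import numpy as np

def magprune(delta: np.ndarray, density: float, epsilon: float,
             rng: np.random.Generator) -> np.ndarray:
    """Drop elements of a delta tensor with magnitude-aware probabilities.

    Mean drop probability is (1 - density); per-element probabilities vary by
    at most +/- epsilon, with larger-magnitude deltas dropped less often.
    """
    flat = delta.ravel()
    n = flat.size
    # Rank elements by |magnitude|: rank 0 = smallest, n-1 = largest.
    ranks = np.argsort(np.argsort(np.abs(flat)))
    # Linearly map rank to a drop probability in
    # [(1 - density) - epsilon, (1 - density) + epsilon],
    # so the largest-magnitude deltas get the lowest drop probability.
    base = 1.0 - density
    p_drop = base + epsilon * (1.0 - 2.0 * ranks / max(n - 1, 1))
    p_drop = np.clip(p_drop, 0.0, 1.0)
    keep = rng.random(n) >= p_drop
    # Rescale survivors by 1/(1 - p) so the expected delta is unchanged.
    out = np.where(keep, flat / np.maximum(1.0 - p_drop, 1e-8), 0.0)
    return out.reshape(delta.shape)

rng = np.random.default_rng(0)
delta = rng.normal(size=(64, 64)).astype(np.float32)
pruned = magprune(delta, density=0.5, epsilon=0.05, rng=rng)
print(f"kept {np.mean(pruned != 0):.0%} of deltas")  # ~50% at density=0.5
```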