## GGUF quants

Uploaded quants:

fp16 (in main) - good for converting to other platforms or making the quantization you actually want; not recommended for running directly

q8_0 (in own branch) - if you've got the spare capacity, might as well?

q6_0 (in own branch) - probably the best balance of quality and performance

q4 (in main) - recommended for ~48GB VRAM setups

q3 (in main) - decent quality

q2 (in main) - surprisingly decent quality
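Since some quants live in their own branches rather than in `main`, the easiest way to grab just one file is `huggingface-cli` with a `--revision` flag. A rough sketch — the repo id below is a placeholder, and it assumes the branch names match the quant labels above:

```shell
# Download only the q8_0 GGUF from its own branch (repo id is a placeholder)
huggingface-cli download your-name/your-model-GGUF \
  --revision q8_0 \
  --include "*.gguf" \
  --local-dir ./models

# Quants listed as "(in main)" need no --revision flag; filter by filename instead
huggingface-cli download your-name/your-model-GGUF \
  --include "*q4*.gguf" \
  --local-dir ./models
```

This also saves bandwidth compared to cloning the whole repo, since `--include` skips the fp16 weights entirely.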
## For the people who like tinkering or looking to save bandwidth