## GGUF quants

Uploaded quants:

fp16 (in main) - good for converting to other platforms or making the quantization you actually want; not recommended for running directly

q8_0 (in own branch) - if you've got the spare capacity, might as well?

q6_0 (in own branch) - probably the best balance of quality and performance

q4 (in main) - recommended for ~48GB VRAM setups

q3 (in main) - decent quality

q2 (in main) - surprisingly decent quality
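Since some quants live in their own branches rather than in `main`, the easiest way to grab just one file is `huggingface-cli` with a `--revision` flag. A rough sketch — the repo id below is a placeholder, and it assumes the branch names match the quant labels above:

```shell
# Download only the q8_0 GGUF from its own branch (repo id is a placeholder)
huggingface-cli download your-name/your-model-GGUF \
  --revision q8_0 \
  --include "*.gguf" \
  --local-dir ./models

# Quants listed as "(in main)" need no --revision flag; filter by filename instead
huggingface-cli download your-name/your-model-GGUF \
  --include "*q4*.gguf" \
  --local-dir ./models
```

This also saves bandwidth compared to cloning the whole repo, since `--include` skips the fp16 weights entirely.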
## For the people who like tinkering or looking to save bandwidth