---
pipeline_tag: text-generation
---
This repository contains files in GGUF format for Yi 34B, adapted to be compatible with standard LLaMA modeling code, based on the work in the chargoddard/Yi-34B-Llama repository.

From chargoddard's work:
- Tensors have been renamed to match standard LLaMA naming.
- The model can be loaded without `trust_remote_code`, but the tokenizer cannot (see the loading sketch below).
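The following is a minimal sketch of what that means in practice, assuming the upstream chargoddard/Yi-34B-Llama repository id; parameter choices such as `device_map="auto"` (which requires the accelerate package) are illustrative, not prescribed by this repository:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "chargoddard/Yi-34B-Llama"

# The renamed tensors let the standard LLaMA modeling code load the
# weights, so no trust_remote_code is needed for the model itself.
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)

# The tokenizer still ships custom code, so it must be trusted explicitly.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
```

Note that this transformers snippet applies to the original PyTorch/safetensors weights; the GGUF files below are for llama.cpp-style runtimes.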
## Converted & Quantized Files
| File | Quant method | Notes |
| --- | --- | --- |
| `Yi-34B-Llama_Q2_K.gguf` | Q2_K | smallest, extreme quality loss; not recommended |
| `Yi-34B-Llama_Q3_K_S.gguf` | Q3_K_S | very small, very high quality loss |
| `Yi-34B-Llama_Q3_K_M.gguf` | Q3_K_M | very small, very high quality loss |
| `Yi-34B-Llama_Q3_K_L.gguf` | Q3_K_L | small, substantial quality loss |
| `Yi-34B-Llama_Q4_K_S.gguf` | Q4_K_S | small, significant quality loss |
| `Yi-34B-Llama_Q4_K_M.gguf` | Q4_K_M | medium, balanced quality; recommended |
| `Yi-34B-Llama_Q5_K_S.gguf` | Q5_K_S | large, low quality loss; recommended |
| `Yi-34B-Llama_Q5_K_M.gguf` | Q5_K_M | large, very low quality loss; recommended |
| `Yi-34B-Llama_Q6_K.gguf` | Q6_K | very large, extremely low quality loss |
| `Yi-34B-Llama_Q8_0.gguf` | Q8_0 | very large, extremely low quality loss; not recommended |
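As a usage example, here is a minimal inference sketch with llama-cpp-python, assuming the Q4_K_M file has already been downloaded to the working directory; the local path, context size, and sampling parameters are illustrative assumptions, not values from this repository:

```python
from llama_cpp import Llama

# Path assumes the file was downloaded locally (e.g. via huggingface-cli).
llm = Llama(
    model_path="./Yi-34B-Llama_Q4_K_M.gguf",
    n_ctx=4096,       # context window; adjust to your memory budget
    n_gpu_layers=-1,  # offload all layers if built with GPU support; 0 for CPU-only
)

output = llm(
    "Q: Name the planets in the solar system. A:",
    max_tokens=64,
    stop=["Q:", "\n\n"],
)
print(output["choices"][0]["text"])
```

The smaller quants (Q2_K, Q3_K_*) trade quality for memory; Q4_K_M is the usual balanced starting point, with the Q5/Q6 files preferable when RAM or VRAM allows.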