This repository contains GGUF-format files for Yi-34B-Llama, a LLaMA-compatible conversion of Yi 34B based on the chargoddard/Yi-34B-Llama repository.
Following chargoddard's work:
- Tensors have been renamed to match the standard LLaMA naming convention.
- The model can be loaded without `trust_remote_code`, but the tokenizer cannot.
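As a minimal sketch of that split, the model weights load with the stock transformers LLaMA implementation while the tokenizer still needs `trust_remote_code=True`. The repository id below is taken from the card; everything else follows the standard transformers API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "chargoddard/Yi-34B-Llama"

def load(repo: str = REPO):
    # Tensors were renamed to the standard LLaMA layout, so the model
    # itself loads with the stock implementation -- no trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(repo)
    # The tokenizer still relies on custom code shipped with the repo,
    # so it must be loaded with trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
    return model, tokenizer
```

Note that the GGUF files in this repository are for llama.cpp-style runtimes; the snippet above applies to the original safetensors/PyTorch weights.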
Converted & Quantized Files
Yi-34B-Llamafied Model Options
The following tables list the available Yi-34B-Llamafied model files with their respective quantization methods and characteristics.
Key:
- Size: File size relative to the original.
- Quality Loss: The amount of quality loss due to quantization.
| Q-Method | File Name | Size | Quality Loss | Recommended |
|---|---|---|---|---|
| Q2 | Yi-34B-Llama_Q2_K | Smallest | Extreme (not recommended) | |
| Q3 | Yi-34B-Llama_Q3_K_S | Very Small | Very High | |
| Q3 | Yi-34B-Llama_Q3_K_M | Very Small | Very High | |
| Q3 | Yi-34B-Llama_Q3_K_L | Small | Substantial | |
| Q4 | Yi-34B-Llama_Q4_K_S | Small | Significant | |
| Q4 | Yi-34B-Llama_Q4_K_M | Medium | Balanced | Recommended |
| Q5 | Yi-34B-Llama_Q5_K_S | Large | Low | Recommended |
| Q5 | Yi-34B-Llama_Q5_K_M | Large | Very Low | Recommended |
| Q6 | Yi-34B-Llama_Q6_K | Very Large | Extremely Low | |
| Q8 | Yi-34B-Llama_Q8_0 | Very Large | Extremely Low (not recommended) | |
Please choose the model that best suits your needs based on the size and quality loss trade-offs.
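To make the size trade-off concrete, a rough file-size estimate is simply parameter count times bits-per-weight divided by eight. The bits-per-weight figures below are approximate averages for llama.cpp k-quants (an assumption, not values from this card):

```python
# Rough GGUF file-size estimate for a ~34B-parameter model.
# Bits-per-weight values are approximate llama.cpp k-quant averages
# (assumption for illustration; actual file sizes will differ).
PARAMS = 34e9

BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def est_size_gb(params: float, bpw: float) -> float:
    """Estimated file size in GB: params * bits-per-weight / 8 bits-per-byte."""
    return params * bpw / 8 / 1e9

for method, bpw in BITS_PER_WEIGHT.items():
    print(f"{method:8s} ~{est_size_gb(PARAMS, bpw):5.1f} GB")
```

This illustrates why Q2_K is the smallest option and Q8_0 among the largest; pick the largest quant that fits in your RAM/VRAM with room for context.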