---
pipeline_tag: text-generation
---

# Yi-34B-Llama-GGUF

This repository contains GGUF-format files for Yi-34B converted to the standard LLaMA architecture, based on the chargoddard/Yi-34B-Llama repository.
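To fetch a single quantized file rather than cloning the whole repository, you can use `huggingface_hub`. A minimal sketch, assuming this repo's id is `simuarc/Yi-34B-Llama-GGUF` (adjust if it differs):

```python
# Minimal sketch: download one quantized file from the Hub.
# The repo id below is an assumption based on this repository's name.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="simuarc/Yi-34B-Llama-GGUF",
    filename="Yi-34B-Llama_Q4_K_M.gguf",  # any file listed below
)
print(path)  # local path to the downloaded GGUF file
```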

Based on chargoddard's work:

- Tensors have been renamed to match standard LLaMA naming.
- The model can be loaded without trust_remote_code, but the tokenizer cannot (see the sketch below).
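A minimal sketch of what that means in practice with transformers, against the upstream chargoddard/Yi-34B-Llama repo (not the GGUF files here):

```python
# Minimal sketch against the upstream chargoddard/Yi-34B-Llama repo.
# A 34B model needs substantial RAM/VRAM; device_map="auto" (requires
# the accelerate package) spreads the weights across available devices.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "chargoddard/Yi-34B-Llama"

# Works without trust_remote_code: tensors follow standard LLaMA naming.
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

# The tokenizer still ships custom code, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
```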

## Converted & Quantized Files

### Q2

- `Yi-34B-Llama_Q2_K.gguf` with quantization method Q2_K (smallest, extreme quality loss - not recommended).

### Q3

- `Yi-34B-Llama_Q3_K_S.gguf` with quantization method Q3_K_S (very small, very high quality loss).
- `Yi-34B-Llama_Q3_K_M.gguf` with quantization method Q3_K_M (very small, very high quality loss).
- `Yi-34B-Llama_Q3_K_L.gguf` with quantization method Q3_K_L (small, substantial quality loss).

### Q4

- `Yi-34B-Llama_Q4_K_S.gguf` with quantization method Q4_K_S (small, significant quality loss).
- `Yi-34B-Llama_Q4_K_M.gguf` with quantization method Q4_K_M (medium, balanced quality - recommended).

### Q5

- `Yi-34B-Llama_Q5_K_S.gguf` with quantization method Q5_K_S (large, low quality loss - recommended).
- `Yi-34B-Llama_Q5_K_M.gguf` with quantization method Q5_K_M (large, very low quality loss - recommended).

### Q6

- `Yi-34B-Llama_Q6_K.gguf` with quantization method Q6_K (very large, extremely low quality loss).

### Q8

- `Yi-34B-Llama_Q8_0.gguf` with quantization method Q8_0 (very large, extremely low quality loss - not recommended).
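Once downloaded, any of these files can be run with llama.cpp or its Python bindings. A minimal sketch using llama-cpp-python with the recommended Q4_K_M file; the generation parameters are illustrative, not prescriptive:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="Yi-34B-Llama_Q4_K_M.gguf",
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers if built with GPU support
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```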