---
pipeline_tag: text-generation
---

**This repository contains GGUF-format files for Yi 34B, converted to the standard LLaMA architecture and usable with LLaMA-compatible tooling, based on the work in the [chargoddard/Yi-34B-Llama](https://huggingface.co/chargoddard/Yi-34B-Llama) repository.**

Building on chargoddard's work:

  -  Tensors have been renamed to match the standard LLaMA conventions.
  -  The model can be loaded without `trust_remote_code`, but the tokenizer cannot (see the sketch below).
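
A minimal sketch of that loading split, assuming the `transformers` library and the upstream `chargoddard/Yi-34B-Llama` repository (the GGUF files in this repository are instead loaded with llama.cpp-compatible runtimes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "chargoddard/Yi-34B-Llama"

# The renamed tensors let the model load as a plain LLaMA architecture,
# so no remote code is needed for the model itself.
model = AutoModelForCausalLM.from_pretrained(repo)

# The tokenizer still relies on custom code shipped in the repository,
# so trust_remote_code=True is required here.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
```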

## Converted & Quantized Files

### Q2
- Yi-34B-Llama_Q2_K.gguf with quantization method Q2_K *(smallest, extreme quality loss - not recommended)*.

### Q3
- Yi-34B-Llama_Q3_K_S.gguf with quantization method Q3_K_S *(very small, very high quality loss)*.
- Yi-34B-Llama_Q3_K_M.gguf with quantization method Q3_K_M *(very small, very high quality loss)*.
- Yi-34B-Llama_Q3_K_L.gguf with quantization method Q3_K_L *(small, substantial quality loss)*.

### Q4
- Yi-34B-Llama_Q4_K_S.gguf with quantization method Q4_K_S *(small, significant quality loss)*.
- Yi-34B-Llama_Q4_K_M.gguf with quantization method Q4_K_M *(medium, balanced quality - recommended)*.

### Q5
- Yi-34B-Llama_Q5_K_S.gguf with quantization method Q5_K_S *(large, low quality loss - recommended)*.
- Yi-34B-Llama_Q5_K_M.gguf with quantization method Q5_K_M *(large, very low quality loss - recommended)*.

### Q6
- Yi-34B-Llama_Q6_K.gguf with quantization method Q6_K *(very large, extremely low quality loss)*.

### Q8
- Yi-34B-Llama_Q8_0.gguf with quantization method Q8_0 *(very large, extremely low quality loss - not recommended)*.
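
A minimal inference sketch using `llama-cpp-python` (an assumption on my part; any llama.cpp-compatible runtime can load these files), shown with the recommended Q4_K_M file from the list above:

```python
from llama_cpp import Llama

# Path to one of the quantized files above; Q4_K_M is the
# balanced, recommended choice for most hardware.
llm = Llama(
    model_path="Yi-34B-Llama_Q4_K_M.gguf",
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```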