Update README.md
README.md CHANGED
@@ -22,6 +22,16 @@ Run them in [LM Studio](https://lmstudio.ai/)
 
 Run them directly with [llama.cpp](https://github.com/ggerganov/llama.cpp), or any other llama.cpp based project
 
+## Vision
+
+This model has vision capabilities, more details here: https://github.com/ggml-org/llama.cpp/pull/12344
+
+After building with Gemma 3 clip support, run the following command:
+
+```
+./build/bin/llama-gemma3-cli -m google_gemma-3-12b-it-Q8_0.gguf --mmproj mmproj-google_gemma-3-12b-it-f16.gguf
+```
+
 ## Prompt format
 
 ```
@@ -38,6 +48,8 @@ Run them directly with [llama.cpp](https://github.com/ggerganov/llama.cpp), or a
 
 | Filename | Quant type | File Size | Split | Description |
 | -------- | ---------- | --------- | ----- | ----------- |
+| [mmproj-gemma-3-12b-it-f32.gguf](https://huggingface.co/bartowski/google_gemma-3-12b-it-GGUF/blob/main/mmproj-google_gemma-3-12b-it-f32.gguf) | f32 | 1.68GB | false | F32 format MMPROJ file, required for vision. |
+| [mmproj-gemma-3-12b-it-f16.gguf](https://huggingface.co/bartowski/google_gemma-3-12b-it-GGUF/blob/main/mmproj-google_gemma-3-12b-it-f16.gguf) | f16 | 851MB | false | F16 format MMPROJ file, required for vision. |
 | [gemma-3-12b-it-bf16.gguf](https://huggingface.co/bartowski/google_gemma-3-12b-it-GGUF/blob/main/google_gemma-3-12b-it-bf16.gguf) | bf16 | 23.54GB | false | Full BF16 weights. |
 | [gemma-3-12b-it-Q8_0.gguf](https://huggingface.co/bartowski/google_gemma-3-12b-it-GGUF/blob/main/google_gemma-3-12b-it-Q8_0.gguf) | Q8_0 | 12.51GB | false | Extremely high quality, generally unneeded but max available quant. |
 | [gemma-3-12b-it-Q6_K_L.gguf](https://huggingface.co/bartowski/google_gemma-3-12b-it-GGUF/blob/main/google_gemma-3-12b-it-Q6_K_L.gguf) | Q6_K_L | 9.90GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
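To try the new vision files, grab a quant plus the matching mmproj file from the repo. A minimal sketch using `huggingface-cli` (any filename from the table works; `--local-dir ./` is just an example destination):

```
# Install the CLI once: pip install -U "huggingface_hub[cli]"
huggingface-cli download bartowski/google_gemma-3-12b-it-GGUF \
  --include "google_gemma-3-12b-it-Q8_0.gguf" "mmproj-google_gemma-3-12b-it-f16.gguf" \
  --local-dir ./
```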
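The "run them directly with llama.cpp" line in the diff maps to the usual build-and-run flow. A hedged sketch using standard `llama-cli` flags (CPU build; the prompt text is illustrative):

```
# Build llama.cpp from source; add backend flags (CUDA, Metal, ...) as needed
cmake -B build
cmake --build build --config Release

# Text-only chat with the downloaded quant
./build/bin/llama-cli -m google_gemma-3-12b-it-Q8_0.gguf -p "Why is the sky blue?"
```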
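For vision, the diff shows `llama-gemma3-cli` launched without an image, which drops into interactive mode. If the one-shot flags shown in the linked PR are available in your build, passing an image directly should look roughly like this (the `--image`/`-p` usage and `test.jpg` are assumptions, not taken from this README):

```
# One-shot image description; flags assumed from the linked PR's examples
./build/bin/llama-gemma3-cli -m google_gemma-3-12b-it-Q8_0.gguf \
  --mmproj mmproj-google_gemma-3-12b-it-f16.gguf \
  --image test.jpg -p "Describe this image."
```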