Melvin56 committed on
Commit 268a5be
· verified · 1 Parent(s): f721fef

Update README.md

Files changed (1)
  1. README.md +30 -20
README.md CHANGED
@@ -8,33 +8,43 @@ pipeline_tag: text-generation
 developers: Kanana LLM
 training_regime: bf16 mixed precision
 base_model:
-- kakaocorp/kanana-nano-2.1b-instruct
+- huihui-ai/kanana-nano-2.1b-instruct-abliterated
 tags:
 - abliterated
 - uncensored
 ---
+# Melvin56/kanana-nano-2.1b-instruct-abliterated-GGUF
 
-# huihui-ai/kanana-nano-2.1b-instruct-abliterated
+Original Model : [huihui-ai/kanana-nano-2.1b-instruct-abliterated](https://huggingface.co/huihui-ai/kanana-nano-2.1b-instruct-abliterated)
 
+All quants are made using the imatrix dataset.
 
-This is an uncensored version of [kakaocorp/kanana-nano-2.1b-instruct](https://huggingface.co/kakaocorp/kanana-nano-2.1b-instruct) created with abliteration (see [remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers) to learn more about it).
-This is a crude, proof-of-concept implementation for removing refusals from an LLM without using TransformerLens.
 
+| Model  | Size (GB) |
+|:-------|----------:|
+| Q2_K_S |     0.914 |
+| Q2_K   |     0.931 |
+| Q3_K_M |     1.138 |
+| Q4_K_M |     1.385 |
+| Q5_K_M |     1.568 |
+| Q6_K   |     1.826 |
+| Q8_0   |     2.223 |
+| F16    |     4.177 |
+| F32    |     8.342 |
 
-## Use with ollama
-
-You can use [huihui_ai/kanana-nano-abliterated](https://ollama.com/huihui_ai/kanana-nano-abliterated) directly:
-```
-ollama run huihui_ai/kanana-nano-abliterated
-```
-
-### Donation
-
-If you like it, please click 'like' and follow us for more updates.
-You can follow [x.com/support_huihui](https://x.com/support_huihui) to get the latest model information from huihui.ai.
-
-##### Your donation helps us continue our development and improvement; even a cup of coffee helps.
-- bitcoin:
-```
-bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
-```
+|          | CPU (AVX2) | CPU (ARM NEON) | Metal  | cuBLAS | rocBLAS | SYCL     | CLBlast | Vulkan | Kompute |
+| :------- | :--------: | :------------: | :----: | :----: | :-----: | :------: | :-----: | :----: | :-----: |
+| K-quants | ✅         | ✅             | ✅     | ✅     | ✅      | ✅       | ✅ 🐢⁵  | ✅ 🐢⁵ | ❌      |
+| I-quants | ✅ 🐢⁴     | ✅ 🐢⁴         | ✅ 🐢⁴ | ✅     | ✅      | Partial¹ | ❌      | ❌     | ❌      |
 ```
+✅: feature works
+🚫: feature does not work
+❓: unknown, please contribute if you can test it yourself
+🐢: feature is slow
+¹: IQ3_S and IQ1_S, see #5886
+²: Only with -ngl 0
+³: Inference is 50% slower
+⁴: Slower than K-quants of comparable size
+⁵: Slower than cuBLAS/rocBLAS on similar cards
+⁶: Only q8_0 and iq4_nl
+```
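
As a rough sanity check on the size table in the new README, dividing each file size by the parameter count gives an approximate bits-per-weight figure for each quant. A minimal sketch, assuming roughly 2.1 billion parameters (inferred from the "2.1b" in the model name, not stated in the table):

```python
# Approximate bits per weight for selected quants from the size table.
# Assumption: ~2.1e9 parameters, inferred from the model name.
sizes_gb = {
    "Q2_K": 0.931,
    "Q4_K_M": 1.385,
    "Q8_0": 2.223,
    "F16": 4.177,
}
params = 2.1e9

# size in bytes * 8 bits, spread over every weight
bits_per_weight = {name: gb * 1e9 * 8 / params for name, gb in sizes_gb.items()}

for name, bpw in bits_per_weight.items():
    print(f"{name}: {bpw:.2f} bits/weight")
```

F16 comes out near 16 bits per weight, consistent with the assumed parameter count; the remaining fraction of a bit in each quant is GGUF metadata plus tensors (such as embeddings) that are typically kept at higher precision.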