---
base_model: AtlaAI/Selene-1-Mini-Llama-3.1-8B
library_name: transformers
language:
- en
- de
- fr
- it
- pt
- es
pipeline_tag: text-generation
tags:
- llama
- atla
- evaluation
- llm-as-a-judge
- meta
- conversational
- lm-judge
- llama-cpp
- gptq
license: llama3.1
---


<p align="center">
  <picture>
    <source 
      srcset="https://atla-ai.notion.site/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2Ff08e6e70-73af-4363-9621-90e906b92ebc%2F1bfb4316-1ce6-40a0-800c-253739cfcdeb%2Fatla_white3x.svg?table=block&id=17c309d1-7745-80f9-8f60-e755409acd8d&spaceId=f08e6e70-73af-4363-9621-90e906b92ebc&userId=&cache=v2"
      media="(prefers-color-scheme: dark)"
      width="200"
    />
    <source 
      srcset="https://atla-ai.notion.site/image/attachment%3A230448e8-921f-45df-b2af-a3158b6c04cd%3Aatla_black2x.png?table=block&id=188309d1-7745-805c-87e4-c39ca54d598d&spaceId=f08e6e70-73af-4363-9621-90e906b92ebc&width=2000&userId=&cache=v2"
      media="(prefers-color-scheme: light)"
      width="200"
    />
    <img 
      src="https://atla-ai.notion.site/image/attachment%3A230448e8-921f-45df-b2af-a3158b6c04cd%3Aatla_black2x.png?table=block&id=188309d1-7745-805c-87e4-c39ca54d598d&spaceId=f08e6e70-73af-4363-9621-90e906b92ebc&width=2000&userId=&cache=v2"
      width="200"
    />
  </picture>
</p>
<p align="center">
    🛝 <a href="https://hf.co/spaces/AtlaAI/selene">Playground</a> | 
    📄 <a href="https://huggingface.co/spaces/AtlaAI/selene-1-mini-tech-report">Technical report</a> | 
    💻 <a href="https://github.com/atla-ai/selene-mini">GitHub</a> | 
    👀 <a href="https://www.atla-ai.com/sign-up-waitlist?utm_source=huggingface&utm_medium=community&utm_campaign=WL_HF_all_communitypost_sel1minilaunch" style="background-image: linear-gradient(to right, red, orange, yellow, green, blue, indigo, violet); -webkit-background-clip: text; color: transparent; animation: rainbow 5s ease infinite; text-decoration: underline; text-decoration-color: currentColor;">Sign up for the API</a>
</p>

<style>
@keyframes rainbow {
    0% { background-position: 0% 50%; }
    50% { background-position: 100% 50%; }
    100% { background-position: 0% 50%; }
}
</style>

# AtlaAI/Selene-1-Mini-Llama-3.1-8B-GPTQ-W4A16
This model is a **4-bit** (W4A16) GPTQ quantisation of [`AtlaAI/Selene-1-Mini-Llama-3.1-8B`](https://huggingface.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B), produced with vLLM's [llm-compressor](https://docs.vllm.ai/en/latest/features/quantization/int4.html) library.
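
W4A16 means the weights are stored as 4-bit integers while activations stay in 16-bit floating point. As a rough illustration of what 4-bit weight storage involves, here is a minimal sketch that quantises one small group of weights with plain round-to-nearest and a single shared scale. This is not the GPTQ algorithm, which additionally uses calibration data to compensate for rounding error, and it is not the llm-compressor API. The function names and the weight values are hypothetical.

```python
# Illustrative only: round-to-nearest 4-bit quantisation of one weight group.
# GPTQ goes further by using calibration data to reduce the rounding error.

def quantise_group(weights, bits=4):
    """Map floats to signed integer codes that share one scale factor."""
    qmax = 2 ** (bits - 1) - 1      # largest code, 7 for 4-bit
    qmin = -(2 ** (bits - 1))       # smallest code, -8 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale

def dequantise_group(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]

# Hypothetical weight values, chosen only for the demonstration.
weights = [0.12, -0.70, 0.33, 0.05, -0.21, 0.64, -0.48, 0.00]
codes, scale = quantise_group(weights)
restored = dequantise_group(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(codes)
print(f"scale={scale:.4f} max_error={max_error:.4f}")
```

Each code fits in 4 bits, so the group costs roughly a quarter of its 16-bit size plus one scale value. The rounding error per weight is bounded by half the scale, which is the error that calibration-based methods such as GPTQ aim to reduce further.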

Refer to the [original model card](https://huggingface.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) for more details on the model.

This quantisation was calibrated on a sample of 512 datapoints drawn from the data used to train Selene-1-Mini.
As a result, the quantised model shows minimal performance degradation, losing <0.5% overall across benchmarks.

For reference, a GPTQ-quantised 8-bit [Llama-3.1-8B](https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8) shows ~1.5% degradation across benchmarks.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/633c4fb100732349209f2aad/K-455q4bMkd11-0P4XSdc.png)