---
library_name: transformers
base_model: meta-llama/Meta-Llama-3.1-70B-Instruct
license: llama3.1
model-index:
- name: Meta-Llama-3.1-70B-Instruct-INT8
results: []
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
---
# Model Card for Meta-Llama-3.1-70B-Instruct-INT8
This is an **8-bit (INT8)** quantized version of `Llama 3.1 70B Instruct`, quantized with `bitsandbytes` and `accelerate`.
- **Developed by:** Farid Saud @ DSRS
- **License:** llama3.1
- **Base Model:** meta-llama/Meta-Llama-3.1-70B-Instruct
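For reference, 8-bit quantization with `bitsandbytes` is typically driven through `BitsAndBytesConfig` when loading the base model. The sketch below is illustrative of that standard `transformers` workflow, not the exact script used to produce this repository:
```python
# Illustrative sketch: producing an INT8 checkpoint with transformers + bitsandbytes
# (assumed workflow, not the exact script used for this repository)
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "meta-llama/Meta-Llama-3.1-70B-Instruct"
quant_config = BitsAndBytesConfig(load_in_8bit=True)  # LLM.int8() quantization

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place the shards across available GPUs
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# The quantized weights can then be saved locally or pushed to the Hub
model.save_pretrained("Meta-Llama-3.1-70B-Instruct-INT8")
tokenizer.save_pretrained("Meta-Llama-3.1-70B-Instruct-INT8")
```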
## Use this model
Use a pipeline as a high-level helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="fsaudm/Meta-Llama-3.1-70B-Instruct-INT8")
pipe(messages)
```
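Device placement and generation settings can also be passed to the pipeline; the values below are illustrative:
```python
# Illustrative: device placement and generation settings for the pipeline
pipe = pipeline(
    "text-generation",
    model="fsaudm/Meta-Llama-3.1-70B-Instruct-INT8",
    device_map="auto",  # requires accelerate; spreads layers across available GPUs
)
outputs = pipe(messages, max_new_tokens=256)
print(outputs[0]["generated_text"])
```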
Or load the model and tokenizer directly:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fsaudm/Meta-Llama-3.1-70B-Instruct-INT8")
model = AutoModelForCausalLM.from_pretrained(
    "fsaudm/Meta-Llama-3.1-70B-Instruct-INT8",
    device_map="auto",  # requires accelerate; places the INT8 weights on available GPUs
)
```
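With the model and tokenizer loaded, a response can be generated via the chat template; this is a minimal sketch using the standard `transformers` generation API:
```python
# Minimal generation sketch (standard transformers API)
messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```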
Information on the base model can be found in the original model card: [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct).