---
language:
- en
- ar
library_name: openvino
pipeline_tag: text-generation
license: apache-2.0
base_model: inceptionai/jais-13b
tags:
- openvino
- optimized
- int4
- awq
- bilingual
- arabic
- english
- jais
---
# Jais-13B OpenVINO INT4
This repository contains the [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) model optimized for inference with Intel's OpenVINO runtime. The model has been quantized to INT4 using the AWQ quantization scheme for improved performance while maintaining quality.
## Model Details
* **Original Model**: [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b)
* **Model Type**: Bilingual (Arabic-English) Large Language Model
* **Parameters**: 13B
* **OpenVINO Version**: 2024.0+
* **Quantization**: INT4 Symmetric AWQ (Activation-aware Weight Quantization)
* **Group Size**: -1 (per-channel quantization)
Jais-13B is a bilingual model that supports both Arabic and English text generation. The model can:
- Generate fluent text in both Arabic and English
- Respond to prompts in either language
- Handle code-switching between the two languages
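The parameter count and INT4 format listed under Model Details give a rough sense of how much space the weights need. The sketch below is back-of-the-envelope arithmetic only: it assumes a nominal 13 billion parameters, all stored at the stated bit width, and ignores quantization scales, any layers kept at higher precision, and runtime memory such as the KV cache. The helper name `weight_footprint_gb` is hypothetical.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal "13B" from Model Details; the exact parameter count differs slightly.
NUM_PARAMS = 13e9

fp16_gb = weight_footprint_gb(NUM_PARAMS, 16)
int4_gb = weight_footprint_gb(NUM_PARAMS, 4)

print(f"FP16 weights: ~{fp16_gb:.1f} GB")
print(f"INT4 weights: ~{int4_gb:.1f} GB")
```

Treat the INT4 figure as a lower bound; the files in this repository may be larger.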
## Optimization Details
This model was converted from the original Hugging Face model to OpenVINO format using the Optimum Intel library. The following optimization command was used:
```bash
optimum-cli export openvino \
-m inceptionai/jais-13b \
--weight-format int4 \
--sym \
--dataset auto \
--awq \
--group-size -1 \
--trust-remote-code \
jais-13b-int4-sym-ov
```
### Optimization Parameters:
- **INT4 Quantization**: Weights compressed to 4-bit integers
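- **Symmetric Quantization**: Weights are quantized symmetrically around zero with no zero point, which keeps dequantization simple (asymmetric quantization typically preserves slightly more accuracy)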
- **AWQ**: Activation-aware Weight Quantization to preserve model quality
- **Auto Dataset**: `--dataset auto` builds the calibration data for AWQ from text generated by the model itself
- **Group Size**: -1 (quantize each output channel independently)
- **Trust Remote Code**: Enabled to support custom model code
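To make the symmetric, per-channel setting concrete, here is a minimal pure-Python sketch of what quantizing each output channel with a single scale and no zero point looks like. It is an illustration under stated assumptions, not the code path NNCF or Optimum Intel actually runs: the signed range [-8, 7], the max-abs scale rule, and all function names are assumptions made for this example, and AWQ's activation-aware rescaling is left out.

```python
from typing import List, Tuple

INT4_MIN, INT4_MAX = -8, 7  # signed 4-bit range assumed for this sketch


def quantize_row_sym(row: List[float]) -> Tuple[List[int], float]:
    """Quantize one output channel with a single scale and no zero point."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
    q = [max(INT4_MIN, min(INT4_MAX, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Map the 4-bit integers back to approximate float weights."""
    return [v * scale for v in q]


def quantize_per_channel(weights: List[List[float]]) -> List[Tuple[List[int], float]]:
    """Group size -1: one scale per output channel (row), covering all inputs."""
    return [quantize_row_sym(row) for row in weights]


# Toy 2x4 weight matrix with made-up values (rows are output channels).
weights = [
    [0.12, -0.50, 0.33, 0.07],
    [2.00, -1.10, 0.40, -0.05],
]

quantized = quantize_per_channel(weights)
restored = [dequantize_row(q, scale) for q, scale in quantized]

for row, (q, scale), back in zip(weights, quantized, restored):
    max_err = max(abs(a - b) for a, b in zip(row, back))
    print(q, f"scale={scale:.4f}", f"max_err={max_err:.4f}")
```

Because each row has its own scale, a channel with small weights is not forced onto the coarse grid of a channel with large weights. The rounding error per weight stays within half a quantization step of that row.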
## Usage
### Prerequisites
- OpenVINO 2024.0 or newer
- optimum-intel
- transformers
### Sample Inference Code with Optimum Intel
```python
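from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

# Load the tokenizer and the OpenVINO model.
# trust_remote_code=True mirrors the export flag and is an assumption here;
# drop it if the model loads without it in your environment.
model_id = "rpanchum/jais-13b-int4-sym-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = OVModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Tokenize the prompt (returns input_ids and attention_mask)
prompt = "Write a short story about a robot learning to paint:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 512 new tokens. do_sample=True is needed for
# temperature and top_p to take effect; without it decoding is greedy.
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)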
```
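The `temperature` and `top_p` arguments in the call above shape how each next token is drawn when sampling is enabled. The following pure-Python sketch shows the general idea of temperature scaling followed by top-p (nucleus) filtering on a toy vocabulary. It is an illustration, not the `transformers` implementation; the logits are made up and the function names are hypothetical.

```python
import math
import random
from typing import Dict, List


def top_p_distribution(logits: List[float], temperature: float, top_p: float) -> Dict[int, float]:
    """Return the renormalized token distribution after temperature and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    cumulative = 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


def sample_token(dist: Dict[int, float], rng: random.Random) -> int:
    """Draw one token id from the filtered distribution."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # made-up scores for a 5-token vocabulary
dist = top_p_distribution(logits, temperature=0.7, top_p=0.9)
token = sample_token(dist, random.Random(0))
print(dist, token)
```

Lower `top_p` values discard more of the low-probability tail, which tends to make output more conservative; raising `temperature` has the opposite effect.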
### Alternative: Using OpenVINO GenAI
1. Install packages required for using OpenVINO GenAI.
```bash
pip install openvino-genai huggingface_hub
```
2. Download model and run inference.
```python
import huggingface_hub as hf_hub
import openvino_genai as ov_genai

# Download the model files to a local directory
model_id = "rpanchum/jais-13b-int4-sym-ov"
model_path = "jais-13b-int4-sym-ov"
hf_hub.snapshot_download(model_id, local_dir=model_path)

# Build the pipeline on the target device
device = "CPU"
pipe = ov_genai.LLMPipeline(model_path, device)

# max_length limits the total length (prompt plus generated tokens)
print(pipe.generate("ما هو الذكاء الاصطناعي؟", max_length=200))  # "What is AI?" in Arabic
print(pipe.generate("What is artificial intelligence?", max_length=200))
```
## License
This model inherits the license of the original [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) model (Apache 2.0, as declared in the metadata above).