---
library_name: transformers
tags:
- government
- conversational
- question-answering
- dutch
- geitje
license: apache-2.0
datasets:
- Nelis5174473/Dutch-QA-Pairs-Rijksoverheid
language:
- nl
pipeline_tag: text-generation
---
<p align="center" style="margin:0;padding:0">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65e04544f59f66e0e072dc5c/b-OsZLNJtPHMwzbgwmGlV.png" alt="GovLLM Ultra banner" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
</p>
<div style="margin:auto; text-align:center">
<h1 style="margin-bottom: 0">GovLLM-7B-ultra</h1>
<em>A question answering model about the Dutch Government.</em>
</div>
## Model description
GovLLM-7B-ultra is a fine-tuned version of the Dutch conversational model [BramVanroy/GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra), trained on a [Dutch question-answer pair dataset](https://huggingface.co/datasets/Nelis5174473/Dutch-QA-Pairs-Rijksoverheid) built from Dutch government sources. The model is ultimately based on Mistral and was fine-tuned with SFT and LoRA. Training for 3 epochs took almost 2 hours on an NVIDIA A100 (40 GB VRAM).
## Usage with Inference Endpoints (Dedicated)
```python
import requests

# Replace with your own endpoint URL and access token
API_URL = "https://your-own-endpoint.us-east-1.aws.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer hf_your_own_token"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Geeft de overheid subsidie aan bedrijven?"
})

# print generated answer
print(output[0]['generated_text'])
```
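The endpoint also accepts an optional `parameters` object in the payload, following the Hugging Face text-generation API. The sketch below is illustrative: the parameter values are assumptions, not tuned settings for this model.

```python
# Sketch: payload with optional generation parameters (illustrative values).
# The "parameters" keys follow the Hugging Face text-generation API.
def build_payload(question, max_new_tokens=256, temperature=0.7):
    return {
        "inputs": question,
        "parameters": {
            "max_new_tokens": max_new_tokens,  # cap on generated tokens
            "temperature": temperature,        # sampling temperature
            "do_sample": True,
        },
    }

payload = build_payload("Geeft de overheid subsidie aan bedrijven?")
# output = query(payload)  # using query() from the snippet above
```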
## Training hyperparameters
The following hyperparameters were used during training:
- block_size: 1024
- model_max_length: 2048
- padding: right
- mixed_precision: fp16
- learning_rate: 3e-05
- epochs: 3
- batch_size: 2
- optimizer: adamw_torch
- scheduler: linear
- quantization: int8
- peft: true
- lora_r: 16
- lora_alpha: 16
- lora_dropout: 0.05
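As a rough illustration of what `lora_r: 16` implies: a rank-`r` LoRA adapter on a `d × k` weight matrix adds `r · (d + k)` trainable parameters. The projection shapes below are the standard Mistral-7B attention dimensions, used here only as an assumption for a back-of-the-envelope count:

```python
# Back-of-the-envelope: trainable parameters added by one LoRA adapter.
# A rank-r adapter on a (d x k) weight stores matrices B (d x r) and A (r x k),
# i.e. r * (d + k) extra parameters on top of the frozen base weight.
def lora_params(d, k, r=16):
    return r * (d + k)

# Assumed Mistral-7B attention shapes (hidden size 4096, grouped-query kv dim 1024).
q_proj = lora_params(4096, 4096)  # 131,072 params
k_proj = lora_params(1024, 4096)  # 81,920 params
print(q_proj, k_proj)
```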
### Training results
| Epoch | Loss   | Grad norm | Learning rate | Step    |
|:-----:|-------:|:---------:|:-------------:|:-------:|
| 0.14  | 1.3183 | 0.6038    | 1.3888e-05    | 25/540  |
| 0.42  | 1.0220 | 0.4180    | 2.8765e-05    | 75/540  |
| 0.69  | 0.9251 | 0.4119    | 2.5679e-05    | 125/540 |
| 0.97  | 0.9260 | 0.4682    | 2.2592e-05    | 175/540 |
| 1.25  | 0.8586 | 0.5338    | 1.9506e-05    | 225/540 |
| 1.53  | 0.8767 | 0.6359    | 1.6420e-05    | 275/540 |
| 1.80  | 0.8721 | 0.6137    | 1.3333e-05    | 325/540 |
| 2.08  | 0.8469 | 0.7310    | 1.0247e-05    | 375/540 |
| 2.36  | 0.8324 | 0.7945    | 7.1605e-06    | 425/540 |
| 2.64  | 0.8170 | 0.8522    | 4.0741e-06    | 475/540 |
| 2.91  | 0.8185 | 0.8562    | 9.8765e-07    | 525/540 |
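The learning-rate column is consistent with a linear schedule: roughly 10% warmup up to the 3e-05 peak, then linear decay toward zero over the remaining steps. The sketch below is fitted to the table rows; the exact warmup length and endpoint are inferred, not taken from the training logs.

```python
# Approximate reconstruction of the linear LR schedule behind the table above.
# Constants are fitted to the table rows (assumed: peak 3e-05, ~54 warmup
# steps, decay reaching zero just past step 540).
PEAK = 3e-05

def lr_at(step):
    if step < 54:
        return PEAK * step / 54           # linear warmup
    return PEAK * (541 - step) / 486      # linear decay to zero

print(f"{lr_at(225):.4e}")  # 1.9506e-05, matching the table row at step 225
```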