---
license: mit
datasets:
- Bretagne/ofis_publik_br-fr
- Bretagne/OpenSubtitles_br_fr
- Bretagne/Autogramm_Breton_translation
language:
- fr
- br
base_model:
- facebook/m2m100_418M
pipeline_tag: translation
library_name: transformers
---
# Gallek
* A French -> Breton translation model called **Gallek** (meaning "French" in Breton).
* The current model version reached a **BLEU score of 50** after 10 epochs on a 20% split of the training set (see the evaluation sketch below).
* Fine-tuned only in the fr -> br direction for now.
* Training details available on the [GweLLM Github repository](https://github.com/blackccpie/GweLLM).
Sample test code:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

modelcard = "amurienne/gallek-m2m100"

model = AutoModelForSeq2SeqLM.from_pretrained(modelcard)
tokenizer = AutoTokenizer.from_pretrained(modelcard)

# build a fr -> br translation pipeline (set device="cuda" to run on GPU)
translation_pipeline = pipeline("translation", model=model, tokenizer=tokenizer, src_lang='fr', tgt_lang='br', max_length=512, device="cpu")

# French input text (with the translation instruction prefix)
french_text = "traduis de français en breton: j'apprends le breton à l'école."

result = translation_pipeline(french_text)
print(result[0]['translation_text'])
```
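The BLEU figure above was measured on a 20% split of the training set; the exact evaluation script is in the GweLLM repository. As a rough illustration only, the snippet below sketches how a corpus-level BLEU score could be computed with `sacrebleu` (an assumed choice of metric library), using a placeholder sentence pair rather than the actual evaluation data:

```python
# Illustrative sketch only: computes corpus-level BLEU with sacrebleu.
# The eval_pairs list is a placeholder; the reported score was measured on a
# 20% split of the training datasets listed in the metadata above.
import sacrebleu
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

modelcard = "amurienne/gallek-m2m100"

model = AutoModelForSeq2SeqLM.from_pretrained(modelcard)
tokenizer = AutoTokenizer.from_pretrained(modelcard)

translation_pipeline = pipeline("translation", model=model, tokenizer=tokenizer, src_lang='fr', tgt_lang='br', max_length=512, device="cpu")

# placeholder (French source, Breton reference) pairs
eval_pairs = [
    ("j'apprends le breton à l'école.", "deskiñ a ran brezhoneg er skol."),
]

# prefix each source with the same instruction used in the sample above
sources = [f"traduis de français en breton: {src}" for src, _ in eval_pairs]
references = [ref for _, ref in eval_pairs]

# translate all sources in one call, then score against the single reference set
predictions = [out["translation_text"] for out in translation_pipeline(sources)]
bleu = sacrebleu.corpus_bleu(predictions, [references])
print(f"BLEU: {bleu.score:.1f}")
```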
A demo is available on the [Gallek Space](https://huggingface.co/spaces/amurienne/Gallek).