---
license: mit
datasets:
- Bretagne/ofis_publik_br-fr
- Bretagne/OpenSubtitles_br_fr
- Bretagne/Autogramm_Breton_translation
language:
- fr
- br
base_model:
- facebook/m2m100_418M
pipeline_tag: translation
library_name: transformers
---

# Kellag

* **Kellag** is a Breton -> French translation model.
* Kellag is the temporary "brother" model of [Gallek](https://huggingface.co/amurienne/gallek-m2m100), since a bidirectional fr <-> br model is not ready yet (work in progress).
* The current model version reached a **BLEU score of 50** after 10 epochs, evaluated on a 20% held-out split of the training set.
* Fine-tuned in the br -> fr direction only for now.
* Training details are available on the [GweLLM Github repository](https://github.com/blackccpie/GweLLM).

Sample test code:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

modelcard = "amurienne/kellag-m2m100"

# load the fine-tuned model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(modelcard)
tokenizer = AutoTokenizer.from_pretrained(modelcard)

# build a Breton -> French translation pipeline
translation_pipeline = pipeline("translation", model=model, tokenizer=tokenizer, src_lang='br', tgt_lang='fr', max_length=512, device="cpu")

breton_text = "treiñ eus ar brezhoneg d'ar galleg: deskiñ a ran brezhoneg er skol."

result = translation_pipeline(breton_text)
print(result[0]['translation_text'])
```

A demo is available on the [Gallek Space](https://huggingface.co/spaces/amurienne/Gallek).