Gallek

  • A French -> Breton translation model called Gallek ("Gallek" means "French" in Breton).
  • The current model version reaches a BLEU score of 40 on a held-out 20% split of the training set.
  • Fine-tuned only in the fr -> br direction for now.
  • Training details are available on the GweLLM GitHub repository.
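The BLEU score above compares model output against reference translations by n-gram overlap. Purely to illustrate what the metric measures, here is a self-contained toy version (this is not the project's evaluation code, which lives in the GweLLM repository; real evaluations typically use a library such as sacrebleu):

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Toy single-sentence BLEU: clipped n-gram precisions (n=1..4) with a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c, r = ngram_counts(cand, n), ngram_counts(ref, n)
        overlap = sum((c & r).values())   # matches, clipped by reference counts
        total = max(sum(c.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:              # no smoothing in this sketch
        return 0.0
    brevity = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(bleu("j'apprends le breton à l'école", "j'apprends le breton à l'école"))  # 1.0
```

A perfect match scores 1.0 (often reported as 100); a 40 means substantial but imperfect n-gram overlap with the references.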

Sample test code:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

modelcard = "amurienne/gallek-m2m100"

model = AutoModelForSeq2SeqLM.from_pretrained(modelcard)
tokenizer = AutoTokenizer.from_pretrained(modelcard)

translation_pipeline = pipeline(
    "translation",
    model=model,
    tokenizer=tokenizer,
    src_lang="fr",
    tgt_lang="br",
    max_length=512,
    device="cpu",
)

# note the "traduis de français en breton:" task prefix before the source sentence
french_text = "traduis de français en breton: j'apprends le breton à l'école."

result = translation_pipeline(french_text)
print(result[0]["translation_text"])
```
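The sample prompt prepends a task prefix ("traduis de français en breton:") to the source sentence. A small helper (a hypothetical convenience, not part of the repository) keeps that prefix consistent when translating arbitrary French text:

```python
def with_task_prefix(text: str) -> str:
    """Prepend the fr->br task prefix used in the sample prompt above."""
    return f"traduis de français en breton: {text}"

# e.g. translation_pipeline(with_task_prefix("bonjour"))
print(with_task_prefix("j'apprends le breton à l'école."))  # traduis de français en breton: j'apprends le breton à l'école.
```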

A demo is available on the Gallek Space.

Model size: 484M parameters (Safetensors, F32)