Rutooro-Centric Multilingual Translation Model

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-mul that specializes in translating from English to Rutooro and other East African languages.

Model Description

This translation model focuses specifically on Rutooro while maintaining high quality for other East African languages including Luganda, Acholi, and Runyankore. It was fine-tuned on a carefully curated dataset containing thousands of translation pairs across multiple languages, with special emphasis on rows where Rutooro translations were present.

Supported Languages

The model primarily supports translation from English to:

  • Rutooro (a Ugandan language spoken by the Batooro people)
  • Luganda (the most widely spoken Ugandan language)
  • Acholi (a Nilotic language spoken in northern Uganda and South Sudan)
  • Runyankore (a language spoken in southwestern Uganda)

Other target languages covered by the base model may also work, though with varying quality.

Usage

To use this model for translation:

from transformers import pipeline

# Initialize the translation pipeline
translator = pipeline("translation", model="MubarakB/rutooro-multilingual-translator")

# Translate to Rutooro
text = "Education is important for community development."
rutooro_translation = translator(f">>rutooro<< {text}")
print(f"Rutooro: {rutooro_translation[0]['translation_text']}")

# Translate to other supported languages
luganda_translation = translator(f">>luganda<< {text}")
print(f"Luganda: {luganda_translation[0]['translation_text']}")

acholi_translation = translator(f">>acholi<< {text}")
print(f"Acholi: {acholi_translation[0]['translation_text']}")

runyankore_translation = translator(f">>runyankore<< {text}")
print(f"Runyankore: {runyankore_translation[0]['translation_text']}")

Language Tokens

When using this model, you must prefix your input text with the appropriate language token:

  • >>rutooro<< - For Rutooro translation
  • >>luganda<< - For Luganda translation
  • >>acholi<< - For Acholi translation
  • >>runyankore<< - For Runyankore translation
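Forgetting or mistyping a language token silently produces output in the wrong language, so it can help to wrap the prefixing step in a small helper. Below is a minimal sketch of such a helper; the names `LANG_TOKENS` and `with_lang_token` are illustrative and not part of the model or the transformers library, though the tokens themselves are the ones listed above.

```python
# Illustrative helper (not part of the model): adds the ">>lang<<" token so
# callers don't have to remember the exact prefix format.

LANG_TOKENS = {
    "rutooro": ">>rutooro<<",
    "luganda": ">>luganda<<",
    "acholi": ">>acholi<<",
    "runyankore": ">>runyankore<<",
}

def with_lang_token(lang: str, text: str) -> str:
    """Prefix `text` with the target-language token expected by the model."""
    try:
        token = LANG_TOKENS[lang.lower()]
    except KeyError:
        raise ValueError(f"Unsupported target language: {lang!r}")
    return f"{token} {text}"

if __name__ == "__main__":
    # Heavy import kept inside the guard so the helper can be used on its own.
    from transformers import pipeline

    translator = pipeline(
        "translation", model="MubarakB/rutooro-multilingual-translator"
    )
    for lang in LANG_TOKENS:
        out = translator(with_lang_token(lang, "Education is important."))
        print(f"{lang}: {out[0]['translation_text']}")
```

Raising on an unknown language name surfaces typos immediately instead of letting the model fall back to an unintended target language.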

Example Translations

English: Education is important for development.
  • Rutooro: Okusoma nikwomuhendo ahabw'okukulaakulana.
  • Luganda: Okusoma kikulu nnyo mu nkulaakulana.
  • Acholi: Kwan dongo pire me yubo lobo.
  • Runyankore: Okushoma nikukuru ahabw'okukulaakulana.

English: Mobile phones have transformed communication in rural areas.
  • Rutooro: Esimu zabyemikono zihindwireho enkoragana omubicweka byakyaro.
  • Luganda: Essimu ezitambulizibwa mu ngalo zikyusizza eby'empuliziganya mu byalo.
  • Acholi: Simu latic me cing ocele kit me kwat lok i gang me tung.
  • Runyankore: Amasimu g'ebyemikono gakyusizza empuliziganya mu byalo.

English: The market opens early in the morning.
  • Rutooro: Akatale kagurwaho kare omumakya.
  • Luganda: Akatale kabbika mu makya.
  • Acholi: Gang cuk yabedo labongo ikare me ice.
  • Runyankore: Akatale kakingirweho makya.

English: Women play a crucial role in community development.
  • Rutooro: Abakazzi nibakora mulimo gwa mughaso ngu kukulakulanya ekyaro.
  • Luganda: Abakazi balina ekifo ekikulu mu nkulaakulana y'eggwanga.
  • Acholi: Mon ni tii tic ma kwako alokaloka me kom kin gang.
  • Runyankore: Abakazi bakola omulimu murungi mu nkulaakulana y'ekitundu.

Model Details

  • Base Model: Helsinki-NLP/opus-mt-en-mul
  • Model Type: Sequence-to-Sequence (Encoder-Decoder Transformer)
  • Parameters: ~77M (F32)
  • Training Data: Multilingual dataset with a focus on Rutooro translations
  • Fine-tuning: Targeted fine-tuning with special emphasis on Rutooro language pairs
  • Language Coverage (share of dataset rows with a translation in each language):
    • Rutooro (11.75% of dataset)
    • Luganda (99.86% of dataset)
    • Acholi (99.87% of dataset)
    • Runyankore (99.87% of dataset)
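Since this is a standard encoder-decoder checkpoint, it can also be loaded without the pipeline wrapper for finer control over decoding (beam width, output length). The sketch below assumes the hypothetical helper name `translate`; `AutoTokenizer`, `AutoModelForSeq2SeqLM`, and `generate` are the standard transformers APIs for seq2seq models.

```python
# Illustrative lower-level usage (a sketch, not the card author's code).
MODEL_ID = "MubarakB/rutooro-multilingual-translator"

def translate(text: str, lang_token: str,
              num_beams: int = 4, max_length: int = 128) -> str:
    """Translate `text` to the language named by `lang_token`
    (e.g. ">>rutooro<<") using beam search."""
    # Lazy import so the module can be loaded without pulling in transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(f"{lang_token} {text}", return_tensors="pt")
    output_ids = model.generate(
        **inputs, num_beams=num_beams, max_length=max_length
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(translate("The market opens early in the morning.", ">>rutooro<<"))
```

Raising `num_beams` trades decoding speed for translation quality; the defaults here are illustrative, not tuned.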

Limitations

  • The model is optimized for general conversational text and may not perform as well on highly specialized or technical content
  • Performance may vary based on language coverage in the training data
  • Quality can vary based on sentence complexity and domain
  • Some languages may benefit from additional fine-tuning with more domain-specific data

Citation

If you use this model in your research, please cite:

@misc{rutooro-multilingual-translator,
  author = {Mubarak Bachu},
  title = {Rutooro-Centric Multilingual Translation Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MubarakB/rutooro-multilingual-translator}}
}

Acknowledgments

This model builds upon the excellent work by Helsinki-NLP and the Opus-MT project. Special thanks to the communities supporting the preservation and computational processing of East African languages.
