Rutooro-Centric Multilingual Translation Model
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-mul that specializes in translating from English to Rutooro and other East African languages.
Model Description
This translation model focuses on Rutooro while maintaining high quality for other East African languages, including Luganda, Acholi, and Runyankore. It was fine-tuned on a curated dataset containing thousands of translation pairs across multiple languages, with special emphasis on rows that include a Rutooro translation.
Supported Languages
The model primarily supports translation from English to:
- Rutooro (Ugandan language spoken by the Batooro people)
- Luganda (Most widely spoken Ugandan language)
- Acholi (Nilotic language spoken in Northern Uganda and South Sudan)
- Runyankore (Language spoken in southwestern Uganda)
Other languages from the base model may also work but with varying quality.
Usage
To use this model for translation:
from transformers import pipeline
# Initialize the translation pipeline
translator = pipeline("translation", model="MubarakB/rutooro-multilingual-translator")
# Translate to Rutooro
text = "Education is important for community development."
rutooro_translation = translator(f">>rutooro<< {text}")
print(f"Rutooro: {rutooro_translation[0]['translation_text']}")
# Translate to other supported languages
luganda_translation = translator(f">>luganda<< {text}")
print(f"Luganda: {luganda_translation[0]['translation_text']}")
acholi_translation = translator(f">>acholi<< {text}")
print(f"Acholi: {acholi_translation[0]['translation_text']}")
runyankore_translation = translator(f">>runyankore<< {text}")
print(f"Runyankore: {runyankore_translation[0]['translation_text']}")
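The four calls above differ only in the language token, so they can be folded into a loop. The following is a minimal sketch rather than part of the model's API: `translate_to_all` and `LANGUAGE_TOKENS` are hypothetical names, and `translator` is assumed to be any callable that behaves like the pipeline above (it takes a string and returns a list of dicts with a `translation_text` key).

```python
from typing import Callable, Dict, Iterable

# Language names taken from the usage example above; the constant name is illustrative.
LANGUAGE_TOKENS = ("rutooro", "luganda", "acholi", "runyankore")


def translate_to_all(
    translator: Callable[[str], list],
    text: str,
    languages: Iterable[str] = LANGUAGE_TOKENS,
) -> Dict[str, str]:
    """Translate `text` once per language by prepending the >>language<< token."""
    results = {}
    for language in languages:
        # Same call shape as the individual examples: one prefixed string in,
        # a list with one result dict out.
        output = translator(f">>{language}<< {text}")
        results[language] = output[0]["translation_text"]
    return results
```

With the pipeline created above, `translate_to_all(translator, text)` would return a dictionary keyed by language name, which is convenient for printing or for building comparison tables.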
Language Tokens
When using this model, you must prefix your input text with the appropriate language token:
- >>rutooro<< - For Rutooro translation
- >>luganda<< - For Luganda translation
- >>acholi<< - For Acholi translation
- >>runyankore<< - For Runyankore translation
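Because the token is required, it can help to build the prefixed input in one place and reject unknown language names early. This is a sketch under stated assumptions: `with_language_token` and `SUPPORTED_LANGUAGES` are hypothetical names, and the set contains only the four tokens listed in this card. The card does not say how the model behaves with a missing or misspelled token, so the check below is a precaution, not a documented requirement of the library.

```python
# Tokens listed in this model card; extend only if other tokens are known to work.
SUPPORTED_LANGUAGES = {"rutooro", "luganda", "acholi", "runyankore"}


def with_language_token(text: str, language: str) -> str:
    """Return `text` prefixed with the >>language<< token, validating the name first."""
    key = language.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language '{language}'. Expected one of: {supported}")
    return f">>{key}<< {text.strip()}"
```

The returned string can be passed directly to the translation pipeline in place of a hand-written f-string.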
Example Translations
English | Rutooro | Luganda | Acholi | Runyankore |
---|---|---|---|---|
Education is important for development. | Okusoma nikwomuhendo ahabw'okukulaakulana. | Okusoma kikulu nnyo mu nkulaakulana. | Kwan dongo pire me yubo lobo. | Okushoma nikukuru ahabw'okukulaakulana. |
Mobile phones have transformed communication in rural areas. | Esimu zabyemikono zihindwireho enkoragana omubicweka byakyaro. | Essimu ezitambulizibwa mu ngalo zikyusizza eby'empuliziganya mu byalo. | Simu latic me cing ocele kit me kwat lok i gang me tung. | Amasimu g'ebyemikono gakyusizza empuliziganya mu byalo. |
The market opens early in the morning. | Akatale kagurwaho kare omumakya. | Akatale kabbika mu makya. | Gang cuk yabedo labongo ikare me ice. | Akatale kakingirweho makya. |
Women play a crucial role in community development. | Abakazzi nibakora mulimo gwa mughaso ngu kukulakulanya ekyaro. | Abakazi balina ekifo ekikulu mu nkulaakulana y'eggwanga. | Mon ni tii tic ma kwako alokaloka me kom kin gang. | Abakazi bakola omulimu murungi mu nkulaakulana y'ekitundu. |
Model Details
- Base Model: Helsinki-NLP/opus-mt-en-mul
- Model Type: Sequence-to-Sequence (Encoder-Decoder Transformer)
- Training Data: Multilingual dataset with focus on Rutooro translations
- Fine-tuning: Targeted fine-tuning with special emphasis on Rutooro language pairs
- Language Coverage (share of dataset rows that include a translation in each language):
  - Rutooro (11.75% of dataset)
  - Luganda (99.86% of dataset)
  - Acholi (99.87% of dataset)
  - Runyankore (99.87% of dataset)
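The percentages above sum to well over 100%, so they presumably describe the share of dataset rows that contain a translation in each language rather than a split of the data. Under that assumption, a figure of this kind could be computed as sketched below. The row layout (one dict per row, one key per language) and the function name `language_coverage` are hypothetical, and the toy rows are placeholders, not training data.

```python
from typing import Dict, List, Optional


def language_coverage(
    rows: List[Dict[str, Optional[str]]],
    languages: List[str],
) -> Dict[str, float]:
    """Return, per language, the percentage of rows with a non-empty translation."""
    total = len(rows)
    if total == 0:
        return {language: 0.0 for language in languages}
    coverage = {}
    for language in languages:
        # A row counts as covered only if the value exists and is not blank.
        present = sum(1 for row in rows if (row.get(language) or "").strip())
        coverage[language] = round(100.0 * present / total, 2)
    return coverage


# Toy rows with placeholder strings (hypothetical layout, not real data).
rows = [
    {"english": "a", "rutooro": "x", "luganda": "y"},
    {"english": "b", "rutooro": None, "luganda": "z"},
    {"english": "c", "rutooro": "", "luganda": "w"},
    {"english": "d", "luganda": "v"},
]
print(language_coverage(rows, ["rutooro", "luganda"]))
```

Missing keys, `None`, and blank strings are all treated as "no translation", which matches the idea of counting only rows where a translation is present.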
Limitations
- The model is optimized for general conversational text and may not perform as well on highly specialized or technical content
- Performance varies with language coverage in the training data; Rutooro appears in only 11.75% of the dataset, far less than the other three supported languages
- Quality can vary based on sentence complexity and domain
- Some languages may benefit from additional fine-tuning with more domain-specific data
Citation
If you use this model in your research, please cite:
@misc{rutooro-multilingual-translator,
author = {Mubarak Bachu},
title = {Rutooro-Centric Multilingual Translation Model},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/MubarakB/rutooro-multilingual-translator}}
}
Acknowledgments
This model builds upon the excellent work by Helsinki-NLP and the Opus-MT project. Special thanks to the communities supporting the preservation and computational processing of East African languages.