|
--- |
|
library_name: transformers |
|
license: mit |
|
datasets: |
|
- galsenai/centralized_wolof_french_translation_data |
|
language: |
|
- wo |
|
- fr |
|
base_model: |
|
- facebook/nllb-200-distilled-600M |
|
pipeline_tag: translation |
|
--- |
|
|
|
# Model Card: NLLB-200 French-Wolof (🇫🇷↔️🇸🇳) Translation Model
|
|
|
## Model Details |
|
|
|
### Model Description |
|
A fine-tuned version of Meta's NLLB-200 (distilled, 600M parameters) specialized for French-to-Wolof translation. The model was trained to improve the accessibility of content between the French and Wolof languages.
|
|
|
- **Developed by:** Lahad |
|
- **Model type:** Sequence-to-Sequence Translation Model |
|
- **Language(s):** French (fra_Latn) ↔️ Wolof (wol_Latn)
|
- **License:** CC-BY-NC-4.0 |
|
- **Finetuned from model:** facebook/nllb-200-distilled-600M |
|
|
|
### Model Sources |
|
- **Repository:** [Hugging Face - Lahad/nllb200-francais-wolof](https://huggingface.co/Lahad/nllb200-francais-wolof) |
|
- **GitHub:** [Fine-tuning NLLB-200 for French-Wolof](https://github.com/LahadMbacke/Fine-tuning_facebook-nllb-200-distilled-600M_French_to_Wolof) |
|
|
|
## Uses |
|
|
|
### Direct Use |
|
- Text translation between French and Wolof |
|
- Content localization |
|
- Language learning assistance |
|
- Cross-cultural communication |
|
|
|
### Out-of-Scope Use |
|
- Commercial use without proper licensing |
|
- Translation of highly technical or specialized content |
|
- Legal or medical document translation where professional human translation is required |
|
- Real-time speech translation |
|
|
|
## Bias, Risks, and Limitations |
|
1. Language Variety Limitations: |
|
- Limited coverage of regional Wolof dialects |
|
- May not handle cultural nuances effectively |
|
|
|
2. Technical Limitations: |
|
- Maximum context window of 128 tokens |
|
- Reduced performance on technical/specialized content |
|
- May struggle with informal language and slang |
|
|
|
3. Potential Biases: |
|
- Training data may reflect cultural biases |
|
- May perform better on standard/formal language |
|
|
|
## Recommendations |
|
- Use for general communication and content translation |
|
- Verify translations for critical communications |
|
- Consider regional language variations |
|
- Implement human review for sensitive content |
|
- Test translations in intended context before deployment |
|
|
|
## How to Get Started with the Model |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load model and tokenizer; src_lang tells the NLLB tokenizer that inputs are French
tokenizer = AutoTokenizer.from_pretrained("Lahad/nllb200-francais-wolof", src_lang="fra_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained("Lahad/nllb200-francais-wolof")

# Translate a French sentence into Wolof
def translate(text, max_length=128):
    # Tokenize the source text, truncating to the model's 128-token context window
    inputs = tokenizer(
        text,
        max_length=max_length,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )

    # Force the decoder to start with the Wolof language token
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("wol_Latn"),
        max_length=max_length,
    )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```
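
The helper above can then be called directly; the sample sentence and the sentence-splitting workaround below are illustrative only, since the model truncates anything beyond 128 tokens:

```python
# Single-sentence example
print(translate("Bonjour, comment allez-vous ?"))

# For passages longer than the 128-token window, a naive workaround is to
# translate sentence by sentence; a proper sentence segmenter would be more robust.
import re

def translate_long(text, max_length=128):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(translate(s, max_length=max_length) for s in sentences if s)
```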
|
|
|
## Training Details |
|
|
|
### Training Data |
|
- **Dataset:** galsenai/centralized_wolof_french_translation_data |
|
- **Split:** 80% training, 20% testing |
|
- **Format:** JSON pairs of French and Wolof translations |
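
A minimal sketch of how such a split can be reproduced with the `datasets` library (assuming the raw data ships as a single `train` split; the seed is arbitrary):

```python
from datasets import load_dataset

# Load the Wolof-French parallel corpus and carve out an 80/20 train/test split
dataset = load_dataset("galsenai/centralized_wolof_french_translation_data")
split = dataset["train"].train_test_split(test_size=0.2, seed=42)
train_data, test_data = split["train"], split["test"]
```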
|
|
|
### Training Procedure |
|
#### Preprocessing |
|
- Dynamic tokenization with padding |
|
- Maximum sequence length: 128 tokens |
|
- Source/target language tags: fra_Latn/wol_Latn
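
A sketch of this preprocessing, assuming the parallel text sits in `fr` and `wo` fields (field names are illustrative, and `train_data`/`test_data` come from the split sketched above):

```python
from transformers import AutoTokenizer

# FLORES-200 language tags tell the NLLB tokenizer which languages it is handling
tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M", src_lang="fra_Latn", tgt_lang="wol_Latn"
)

def preprocess(batch):
    # Tokenize French sources and Wolof targets, truncating to 128 tokens;
    # padding is applied dynamically later by the data collator
    return tokenizer(
        batch["fr"], text_target=batch["wo"], max_length=128, truncation=True
    )

tokenized_train = train_data.map(preprocess, batched=True, remove_columns=train_data.column_names)
tokenized_test = test_data.map(preprocess, batched=True, remove_columns=test_data.column_names)
```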
|
|
|
#### Training Hyperparameters |
|
- Learning rate: 2e-5 |
|
- Batch size: 8 per device |
|
- Training epochs: 3 |
|
- FP16 training: Enabled |
|
- Evaluation strategy: Per epoch |
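
These settings map onto `Seq2SeqTrainingArguments` roughly as follows; the output directory is a placeholder and the trainer wiring continues the preprocessing sketch above, so treat this as an approximation rather than the exact training script:

```python
from transformers import (AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb200-francais-wolof",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    fp16=True,                  # mixed-precision training
    eval_strategy="epoch",      # `evaluation_strategy` on transformers < 4.41
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_test,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),  # dynamic padding
)
trainer.train()
```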
|
|
|
## Evaluation |
|
|
|
### Testing Data, Factors & Metrics |
|
- **Testing Data:** 20% of dataset |
|
- **Metrics:** [Not Specified]

- **Evaluation Factors:**
|
- Translation accuracy |
|
- Semantic preservation |
|
- Grammar correctness |
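
No automatic scores are reported; one way to obtain them on the held-out pairs is with the `evaluate` library, as sketched below (the metric choice, sample size, and `fr`/`wo` field names are assumptions, and `translate`/`test_data` come from the snippets above):

```python
import evaluate

# Corpus-level BLEU and chrF on a small sample of held-out pairs
bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")

sample = test_data.select(range(100))
predictions = [translate(pair["fr"]) for pair in sample]
references = [[pair["wo"]] for pair in sample]

print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])
print("chrF:", chrf.compute(predictions=predictions, references=references)["score"])
```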
|
|
|
## Environmental Impact |
|
- **Hardware Type:** NVIDIA T4 GPU |
|
- **Hours used:** 5 |
|
- **Cloud Provider:** [Not Specified] |
|
- **Compute Region:** [Not Specified] |
|
- **Carbon Emitted:** [Not Calculated] |
|
|
|
## Technical Specifications |
|
|
|
### Model Architecture and Objective |
|
- Architecture: NLLB-200 (Distilled 600M version) |
|
- Objective: Neural Machine Translation |
|
- Parameters: 600M |
|
- Context Window: 128 tokens |
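
A quick sanity check of the parameter count, assuming the model is already loaded as in the getting-started snippet:

```python
# Should print roughly 600M for the distilled NLLB-200 checkpoint
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters")
```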
|
|
|
### Compute Infrastructure |
|
- Training Hardware: NVIDIA T4 GPU |
|
- Training Time: 5 hours |
|
- Software Framework: Hugging Face Transformers |
|
|
|
## Model Card Contact |
|
For questions about this model, please open a discussion in the Community tab of the model's Hugging Face repository.
|
|