|
|
|
# Multilingual GPT Model (Byte-Level)

This is a multilingual GPT model trained on byte-level encodings of Wikipedia articles in Arabic (ar) and Egyptian Arabic (ary).
|
|
|
**Model Details:**

- Trained using a byte-level vocabulary (size: 32,000).
- Architecture: decoder-only Transformer (GPT).
- Languages: Arabic (ar), Egyptian Arabic (ary).
- Training Data: streamed Wikipedia dataset, limited to 10,000 articles per language (see the data-loading sketch below).
- Training Code: [Link to your training script/GitHub repo if available]
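
As a rough sketch of the data pipeline described above, the snippet below streams the two Wikipedia dumps and keeps the first 10,000 articles per language. The `wikimedia/wikipedia` dataset name, the dump date, and the plain UTF-8 byte encoding are assumptions, not the exact training code (the card lists a 32,000-token vocabulary, so the actual tokenizer is likely more than a 256-symbol byte mapping).

```python
from itertools import islice

from datasets import load_dataset

# Assumed dataset and dump date; adjust to whatever was actually used.
CONFIGS = ["20231101.ar", "20231101.ary"]
ARTICLES_PER_LANGUAGE = 10_000

def stream_articles(config, limit=ARTICLES_PER_LANGUAGE):
    """Yield up to `limit` article texts from a streamed Wikipedia config."""
    ds = load_dataset("wikimedia/wikipedia", config, split="train", streaming=True)
    for example in islice(ds, limit):
        yield example["text"]

def byte_encode(text):
    """Naive byte-level encoding: each UTF-8 byte becomes a token id in [0, 255]."""
    return list(text.encode("utf-8"))

# Peek at the first article of each language.
for config in CONFIGS:
    first = next(stream_articles(config, limit=1))
    print(config, byte_encode(first)[:20])
```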
|
|
|
**Usage:** |
|
|
|
To use the model, instantiate the `GPTLanguageModel` class (its definition must be available locally) and load the released weights with `torch.load`, as in the example below.
|
|
|
**Example (conceptual; adapt to your actual loading process):**
|
|
|
```python
import torch

# The model definition must be importable; point this at wherever the
# GPTLanguageModel class is saved.
from your_model_definition_script import GPTLanguageModel

# Re-create the architecture with the same hyperparameters used for training,
# then load the weights (downloaded from the Hub or stored locally).
model = GPTLanguageModel()
state_dict = torch.load('model_weights.pth', map_location='cpu')
model.load_state_dict(state_dict)
model.eval()

# ... (rest of your inference code) ...
```
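
The repository id and inference API are not specified in this card. Assuming the weights are hosted on the Hub and that `GPTLanguageModel` exposes a nanoGPT-style `generate(idx, max_new_tokens)` method (both assumptions to adapt), downloading the checkpoint and sampling might look roughly like this:

```python
import torch
from huggingface_hub import hf_hub_download

from your_model_definition_script import GPTLanguageModel

# Hypothetical repository id; replace with the actual model repo.
weights_path = hf_hub_download(repo_id="<user>/<model-repo>", filename="model_weights.pth")

model = GPTLanguageModel()
model.load_state_dict(torch.load(weights_path, map_location="cpu"))
model.eval()

# Assumed byte-level prompt encoding and nanoGPT-style generation; adapt if the
# actual tokenizer is not a plain 256-symbol byte mapping.
prompt = "مصر"
idx = torch.tensor([list(prompt.encode("utf-8"))], dtype=torch.long)
with torch.no_grad():
    out = model.generate(idx, max_new_tokens=100)

# Decode generated byte ids back to text, replacing any invalid sequences.
print(bytes(out[0].tolist()).decode("utf-8", errors="replace"))
```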
|
|
|
**Training Hyperparameters:**

- Batch Size: 32
- Block Size (context length): 256
- Embedding Dimension: 384
- Number of Heads: 6
- Number of Layers: 6
- Dropout: 0.2
- Optimizer: AdamW (see the configuration sketch below)
- Learning Rate: 0.0006
- Max Iterations: 5000
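
In nanoGPT-style naming (illustrative variable names, not the exact training script), the settings above correspond to roughly the following configuration:

```python
# Hyperparameters mirroring the values listed above; names are assumptions
# following the common nanoGPT convention.
batch_size = 32          # sequences per optimizer step
block_size = 256         # context length in byte tokens
n_embd = 384             # embedding dimension
n_head = 6               # attention heads per layer
n_layer = 6              # transformer blocks
dropout = 0.2
learning_rate = 6e-4     # 0.0006, as listed above
max_iters = 5000
vocab_size = 32000       # as stated in Model Details

# The optimizer named above would be applied to the model's parameters, e.g.:
# optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
```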
|
|
|
**Loss Curve:** |
|
[You can optionally add a link or embed the training plot image here] |
|
|
|
**License:** |
|
[Specify your license, e.g., MIT License] |
|
|
|
**Contact:** |
|
[Your name/contact information] |
|
|