# Multilingual GPT Model (Byte-Level)

A multilingual GPT-style language model trained on byte-level encodings of Wikipedia articles in Arabic (ar) and Egyptian Arabic (ary).

**Model Details:**

- Trained using a byte-level vocabulary (size: 32000).
- Architecture: Transformer-based GPT model.
- Languages: Arabic (ar), Egyptian Arabic (ary).
- Training Data: streamed Wikipedia dataset, limited to 10000 articles per language (see the data-streaming sketch at the end of this card).
- Training Code: [Link to your training script/GitHub repo if available]

**Usage:**

[Provide instructions on how to load and use the model, e.g., using `torch.load` and the provided `GPTLanguageModel` class.]

**Example (conceptual; adapt to your actual loading process):**

```python
import torch
from your_model_definition_script import GPTLanguageModel  # the script that defines the model architecture

# Instantiate the architecture (must match the configuration used for training)
model = GPTLanguageModel()

# Load the weights (a local file, or one downloaded from the Hugging Face Hub)
model.load_state_dict(torch.load('model_weights.pth', map_location='cpu'))
model.eval()

# ... (rest of your inference code; see the generation sketch at the end of this card) ...
```

**Training Hyperparameters:**

- Batch Size: 32
- Block Size (context length): 256
- Embedding Dimension: 384
- Number of Heads: 6
- Number of Layers: 6
- Dropout: 0.2
- Optimizer: AdamW
- Learning Rate: 0.0006
- Max Iterations: 5000

**Loss Curve:**

[You can optionally add a link or embed the training plot image here]

**License:**

[Specify your license, e.g., MIT License]

**Contact:**

[Your name/contact information]
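**Generation example (conceptual sketch):**

The inference code is elided above, so the following is a minimal sketch of what generation might look like. It assumes a nanoGPT-style `generate(idx, max_new_tokens)` method on `GPTLanguageModel` and a direct byte-value-to-token-id mapping; the module name, method, and encoding scheme are assumptions, not confirmed details of this repository. If the 32000-entry vocabulary is actually a byte-level BPE tokenizer, replace the encode/decode steps with that tokenizer.

```python
import torch
from your_model_definition_script import GPTLanguageModel  # placeholder module name

device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = GPTLanguageModel().to(device)
model.load_state_dict(torch.load('model_weights.pth', map_location=device))
model.eval()

# Assumption: prompts are encoded as raw UTF-8 byte values (one token id per byte).
prompt = "مرحبا"  # "Hello" in Arabic
idx = torch.tensor([list(prompt.encode('utf-8'))], dtype=torch.long, device=device)

with torch.no_grad():
    out = model.generate(idx, max_new_tokens=200)  # assumes a nanoGPT-style generate()

# Decode token ids back to bytes; ids outside the byte range are skipped,
# and invalid UTF-8 sequences are ignored.
generated = bytes(t for t in out[0].tolist() if t < 256)
print(generated.decode('utf-8', errors='ignore'))
```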
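**Data streaming (conceptual sketch):**

The card states that training data was streamed from Wikipedia and limited to 10000 articles per language. The sketch below shows one way such a stream could be built with the Hugging Face `datasets` library; the dataset name (`wikimedia/wikipedia`), snapshot date, and language configs are assumptions and may differ from what was actually used.

```python
from itertools import islice
from datasets import load_dataset

def stream_articles(config: str, limit: int = 10_000):
    """Yield up to `limit` article texts from a streamed Wikipedia dump."""
    ds = load_dataset("wikimedia/wikipedia", config, split="train", streaming=True)
    for example in islice(ds, limit):
        yield example["text"]

# Assumed snapshot/config names; the actual training configuration is not documented here.
ar_articles = list(stream_articles("20231101.ar"))
ary_articles = list(stream_articles("20231101.ary"))
```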