# Multilingual GPT Model (Byte-Level)

This model is a multilingual GPT model trained on byte-level encodings of Wikipedia articles in Arabic (ar) and Egyptian Arabic (ary).
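Byte-level encoding means the model's base alphabet is raw UTF-8 bytes, so Arabic script needs no language-specific tokenizer. A minimal illustration in plain Python, independent of the model code:

```python
# UTF-8 bytes are the base alphabet of a byte-level model.
text = "مرحبا"  # "hello" in Arabic
byte_ids = list(text.encode("utf-8"))

print(len(text))      # 5 characters
print(len(byte_ids))  # 10 bytes: Arabic letters take two bytes each in UTF-8

# Decoding the byte IDs round-trips to the original string.
assert bytes(byte_ids).decode("utf-8") == text
```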

**Model Details:**
- Trained using a byte-level vocabulary (size: 32000).
- Architecture: Transformer-based GPT model.
- Languages: Arabic (ar), Egyptian Arabic (ary).
- Training Data: Streamed Wikipedia dataset (limited to 10000 articles per language).
- Training Code: [Link to your training script/GitHub repo if available]
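The per-language article cap can be applied lazily so the full Wikipedia dump is never materialized. A sketch of that pattern with a stand-in iterator (the actual dataset loader used for training is not shown here):

```python
from itertools import islice

def take_articles(stream, limit=10000):
    """Consume at most `limit` items from a (possibly very large) stream."""
    return list(islice(stream, limit))

# Stand-in for a streamed Wikipedia split (e.g. an iterable dataset loaded
# with streaming enabled); replace with the real loader.
fake_stream = ({"text": f"article {i}"} for i in range(25000))

articles = take_articles(fake_stream, limit=10000)
print(len(articles))  # 10000 — the cap described under Training Data
```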

**Usage:**

Load the trained weights with `torch.load` into an instance of the provided `GPTLanguageModel` class, then switch the model to evaluation mode for inference.

**Example (Conceptual - Adapt to your actual loading process):**

```python
import torch
from your_model_definition_script import GPTLanguageModel  # assumes the model class is saved in a separate script

# Recreate the architecture; it must match the training configuration exactly.
model = GPTLanguageModel()

# Load locally downloaded weights; map to CPU in case no GPU is available.
model.load_state_dict(torch.load('model_weights.pth', map_location='cpu'))
model.eval()

# ... (rest of your inference code) ...
```
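Because generation happens at the byte level, sampling can stop in the middle of a multi-byte UTF-8 character, so output byte IDs are best decoded leniently. A model-free sketch (the appended byte values stand in for model output and are illustrative only):

```python
# Prompt preparation and output decoding for a byte-level model.
prompt = "القاهرة"  # "Cairo" in Arabic
prompt_ids = list(prompt.encode("utf-8"))

# ... feed prompt_ids to the model and collect generated byte IDs ...
generated_ids = prompt_ids + [32, 217]  # 217 (0xD9) alone is an incomplete UTF-8 sequence

# errors="replace" maps invalid bytes to U+FFFD instead of raising.
text = bytes(generated_ids).decode("utf-8", errors="replace")
print(text.endswith("\ufffd"))  # True: the dangling byte became a replacement char
```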

**Training Hyperparameters:**
- Batch Size: 32
- Block Size: 256
- Embedding Dimension: 384
- Number of Heads: 6
- Number of Layers: 6
- Dropout: 0.2
- Optimizer: AdamW
- Learning Rate: 0.0006
- Max Iterations: 5000
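These hyperparameters imply a rough model size that can be estimated with standard transformer bookkeeping. A back-of-the-envelope sketch (assumes a 4x MLP expansion and untied input/output embeddings, and ignores biases and LayerNorm parameters; exact totals depend on the implementation):

```python
# Rough parameter-count estimate from the hyperparameters above.
vocab_size = 32000
block_size = 256
n_embd = 384
n_layer = 6

embeddings = vocab_size * n_embd + block_size * n_embd   # token + position tables
per_block = (
    4 * n_embd * n_embd        # attention: Q, K, V and output projections
    + 2 * 4 * n_embd * n_embd  # MLP: up- and down-projection with 4x expansion
)
lm_head = n_embd * vocab_size  # output projection (assumed untied)

total = embeddings + n_layer * per_block + lm_head
print(f"~{total / 1e6:.1f}M parameters")  # ~35.3M parameters
```

Most of that total comes from the 32000-entry vocabulary tables; the transformer blocks themselves account for only about 10.6M parameters.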

**Loss Curve:**
[You can optionally add a link or embed the training plot image here]

**License:**
[Specify your license, e.g., MIT License]

**Contact:**
[Your name/contact information]