Add comprehensive model card for Meta-Llama-3-8B-Instruct fine-tuned on xLAM
README.md CHANGED
@@ -1,3 +1,4 @@
+
 ---
 license: cc-by-nc-4.0
 tags:
@@ -31,7 +32,7 @@ This is a fine-tuned version of the Meta-Llama-3-8B-Instruct model. The model wa
 - **Finetuned from model:** meta-llama/Meta-Llama-3-8B-Instruct
 - **Model size:** 8B parameters
 - **Vocab size:** 128,256 tokens
-- **Max sequence length:**
+- **Max sequence length:** 2,048 tokens
 - **Tensor type:** BF16
 - **Pad token:** `<|eot_id|>` (ID: 128009)
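For readers of the card, a minimal loading sketch that mirrors the details above; the repo id is a placeholder (the diff never names the published checkpoint), while the BF16 dtype and the `<|eot_id|>` pad token (ID 128009) come straight from the list:

```python
# Sketch only: the repo id below is a placeholder, not taken from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Meta_Llama_3_8B_Instruct_xLAM"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = "<|eot_id|>"  # card pins the pad token to ID 128009

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # "Tensor type: BF16" above
    device_map="auto",
)

# Llama 3 Instruct checkpoints expect the chat template.
messages = [{"role": "user", "content": "Which tools can you call?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256, pad_token_id=128009)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```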
@@ -47,11 +48,11 @@ The model was fine-tuned using the following configuration:
 
 ### Training Parameters
 - **Learning Rate:** 0.0001
-- **Batch Size:**
+- **Batch Size:** 16
-- **Gradient Accumulation Steps:**
+- **Gradient Accumulation Steps:** 8
-- **Max Training Steps:**
+- **Max Training Steps:** 1,000
 - **Warmup Ratio:** 0.1
-- **Max Sequence Length:**
+- **Max Sequence Length:** 2,048
 - **Output Directory:** ./Meta_Llama_3_8B_Instruct_xLAM
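The card lists hyperparameters but not the trainer they were fed to; as a hedged sketch, the same values mapped onto `transformers.TrainingArguments` (reading the batch size as per-device is an assumption):

```python
# Hedged sketch: maps the card's hyperparameters onto TrainingArguments;
# the card does not say which training framework was actually used.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./Meta_Llama_3_8B_Instruct_xLAM",
    learning_rate=1e-4,              # Learning Rate: 0.0001
    per_device_train_batch_size=16,  # Batch Size: 16 (assumed per-device)
    gradient_accumulation_steps=8,   # Gradient Accumulation Steps: 8
    max_steps=1_000,                 # Max Training Steps: 1,000
    warmup_ratio=0.1,                # Warmup Ratio: 0.1
    bf16=True,                       # matches the BF16 tensor type above
)
# The 2,048-token Max Sequence Length is enforced when tokenizing the
# xLAM examples; TrainingArguments has no sequence-length field.
```

At these settings each optimizer step sees 16 × 8 = 128 sequences per device.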
 
 ### LoRA Configuration
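The diff cuts off before the values under `### LoRA Configuration`, so none are reproduced here; purely to illustrate the shape of such a section, a `peft` sketch in which every number and module name is a placeholder:

```python
# Illustration only: the real LoRA values live in the card section the
# diff truncates; every value below is a placeholder, not from the card.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                                 # placeholder rank
    lora_alpha=32,                        # placeholder scaling factor
    lora_dropout=0.05,                    # placeholder dropout
    target_modules=["q_proj", "v_proj"],  # placeholder module list
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)  # `model` from the loading sketch
model.print_trainable_parameters()
```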