Update README.md
README.md
CHANGED
@@ -63,6 +63,7 @@ This model was trained for 1000 steps (1.2 epochs) with the model being evaluate
 We used the [qlora](https://github.com/artidoro/qlora) package from artidoro.
 We trained with the following hyperparameters:
 
+```
 Per device evaluation batch size: 16
 Per device train batch size: 8
 LoRA (lora_r): 64
@@ -81,6 +82,7 @@ Adam beta2: 0.999
 Maximum gradient norm: 0.3
 LoRA dropout: 0.05
 Weight decay: 0.0
+```
 
 
 
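For readers who want to reuse the values visible in this diff, here is a minimal sketch that collects them in a Python mapping and renders them as `--key value` tokens. The key names are an assumption: they follow the usual Hugging Face `TrainingArguments` and qlora argument naming, which this diff does not confirm. The helper `to_cli_args` is hypothetical, and hyperparameters on README lines outside the diff context are not reproduced.

```python
# Values copied from the diff context above. Key names are assumed, not confirmed
# by the README; lines hidden between the two hunks are intentionally left out.
HYPERPARAMS = {
    "max_steps": 1000,                  # "trained for 1000 steps"
    "per_device_eval_batch_size": 16,
    "per_device_train_batch_size": 8,
    "lora_r": 64,
    "adam_beta2": 0.999,
    "max_grad_norm": 0.3,
    "lora_dropout": 0.05,
    "weight_decay": 0.0,
}


def to_cli_args(params):
    """Render a mapping as a flat list of `--key value` command-line tokens."""
    args = []
    for key, value in params.items():
        args.extend([f"--{key}", str(value)])
    return args


if __name__ == "__main__":
    # Print the tokens on one line so they can be pasted after a training command.
    print(" ".join(to_cli_args(HYPERPARAMS)))
```

Keeping the values in one mapping makes it easy to compare them against the README when it changes again.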