Tags: Text Generation · Transformers · PyTorch · Japanese · llama · text-generation-inference · Inference Endpoints
ptrdvn committed
Commit 3cefbdb · 1 Parent(s): 8da1560

Update README.md

Files changed (1):
  1. README.md +2 -0
README.md CHANGED
@@ -63,6 +63,7 @@ This model was trained for 1000 steps (1.2 epochs) with the model being evaluate
 We used the [qlora](https://github.com/artidoro/qlora) package from artidoro.
 We trained with the following hyperparameters:
 
+```
 Per device evaluation batch size: 16
 Per device train batch size: 8
 LoRA (lora_r): 64
@@ -81,6 +82,7 @@ Adam beta2: 0.999
 Maximum gradient norm: 0.3
 LoRA dropout: 0.05
 Weight decay: 0.0
+```
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/UWiE7z5tG8t_vdSFrb5WC.png)
 
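For readers who want to reproduce a comparable setup, here is a minimal sketch of how the hyperparameters listed in this diff could map onto a peft `LoraConfig` and Hugging Face `TrainingArguments`. This is an illustrative assumption, not the exact invocation used: the actual run went through artidoro's qlora package, and values not shown in this diff (e.g. lora_alpha or the learning rate) are omitted rather than guessed.

```
from peft import LoraConfig
from transformers import TrainingArguments

# Hypothetical mapping of the README's settings onto standard
# Hugging Face / peft arguments; the real training used the qlora
# package (https://github.com/artidoro/qlora), which wraps these APIs.
lora_config = LoraConfig(
    r=64,                # LoRA (lora_r): 64
    lora_dropout=0.05,   # LoRA dropout: 0.05
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./output",          # hypothetical path
    per_device_eval_batch_size=16,  # Per device evaluation batch size: 16
    per_device_train_batch_size=8,  # Per device train batch size: 8
    adam_beta2=0.999,               # Adam beta2: 0.999
    max_grad_norm=0.3,              # Maximum gradient norm: 0.3
    weight_decay=0.0,               # Weight decay: 0.0
    max_steps=1000,                 # 1000 steps (1.2 epochs), per the README
)
```

Passing these two objects to a peft-wrapped model and a `Trainer` would approximate the configuration above, though the qlora package adds 4-bit quantization on top of this.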