jtatman committed · verified · Commit 2608579 · 1 Parent(s): 2de4cf0

Update README.md

Files changed (1): README.md (+10 -10)
README.md CHANGED
@@ -69,18 +69,18 @@ This technique avoids training unnecessarily low-performing weights that can turn
 
 Axolotl was used for training and dataset tokenization.
 
-#### Preprocessing [optional]
+#### Preprocessing
 
 Dataset was formatted in ShareGpt format for the purposes of using with Axolotl, in conversational format.
 
 #### Training Hyperparameters
 
-lora_r: 64
-lora_alpha: 16
-lora_dropout: 0.05
-gradient_accumulation_steps: 4
-micro_batch_size: 1
-num_epochs: 3
-optimizer: adamw_bnb_8bit
-lr_scheduler: cosine
-learning_rate: 0.00025
+- lora_r: 64
+- lora_alpha: 16
+- lora_dropout: 0.05
+- gradient_accumulation_steps: 4
+- micro_batch_size: 1
+- num_epochs: 3
+- optimizer: adamw_bnb_8bit
+- lr_scheduler: cosine
+- learning_rate: 0.00025
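
For context, the hyperparameters listed in the diff map roughly onto a standard Hugging Face PEFT/Transformers setup. The sketch below is illustrative only: the actual run used Axolotl's YAML config, and the output directory, task type, and any model-specific choices here are placeholder assumptions, not values taken from this commit.

```python
# Rough sketch of the README's hyperparameters expressed with peft/transformers.
# The actual training used Axolotl; output_dir and task_type are placeholders.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                  # lora_r: 64
    lora_alpha=16,         # lora_alpha: 16
    lora_dropout=0.05,     # lora_dropout: 0.05
    task_type="CAUSAL_LM", # assumed task type
)

training_args = TrainingArguments(
    output_dir="outputs",              # placeholder
    per_device_train_batch_size=1,     # micro_batch_size: 1
    gradient_accumulation_steps=4,     # gradient_accumulation_steps: 4
    num_train_epochs=3,                # num_epochs: 3
    learning_rate=2.5e-4,              # learning_rate: 0.00025
    lr_scheduler_type="cosine",        # lr_scheduler: cosine
    optim="adamw_bnb_8bit",            # optimizer: adamw_bnb_8bit
)
```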