SystemAdmin123 committed
Commit 7a7c676 · verified · 1 Parent(s): 1784a60

End of training

Files changed (1): README.md (+12 −5)
README.md CHANGED
@@ -36,7 +36,7 @@ datasets:
 system_prompt: ''
 device_map: auto
 eval_sample_packing: false
-eval_steps: 200
+eval_steps: 20
 flash_attention: true
 gradient_checkpointing: true
 group_by_length: true
@@ -54,7 +54,7 @@ output_dir: /root/.sn56/axolotl/tmp/tiny-random-LlamaForCausalLM
 pad_to_sequence_len: true
 resize_token_embeddings_to_32x: false
 sample_packing: true
-save_steps: 200
+save_steps: 20
 save_total_limit: 1
 sequence_len: 2048
 tokenizer_type: LlamaTokenizerFast
@@ -78,6 +78,8 @@ warmup_ratio: 0.05
 # tiny-random-LlamaForCausalLM
 
 This model is a fine-tuned version of [trl-internal-testing/tiny-random-LlamaForCausalLM](https://huggingface.co/trl-internal-testing/tiny-random-LlamaForCausalLM) on the argilla/databricks-dolly-15k-curated-en dataset.
+It achieves the following results on the evaluation set:
+- Loss: 10.1817
 
 ## Model description
 
@@ -111,9 +113,14 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| No log | 0.1667 | 1 | 10.3764 |
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-------:|:----:|:---------------:|
+| No log | 0.1667 | 1 | 10.3764 |
+| 10.3632 | 3.3333 | 20 | 10.3538 |
+| 10.3073 | 6.6667 | 40 | 10.2840 |
+| 10.2203 | 10.0 | 60 | 10.2082 |
+| 10.1812 | 13.3333 | 80 | 10.1828 |
+| 10.1767 | 16.6667 | 100 | 10.1817 |
 
 
 ### Framework versions
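
The config change in this commit lowers `eval_steps` and `save_steps` from 200 to 20, so evaluation and checkpointing fire every 20 optimizer steps rather than every 200. A minimal sketch of that cadence (the `periodic_steps` helper is hypothetical, not part of Axolotl or the trainer; it only assumes the hooks fire at multiples of the configured interval):

```python
# Hypothetical sketch of a step-interval schedule, assuming evaluation
# and checkpointing fire whenever the optimizer step is a multiple of
# the configured interval (as `eval_steps`/`save_steps` are used here).
def periodic_steps(total_steps: int, interval: int) -> list[int]:
    """Steps at which a periodic hook (eval or save) fires."""
    return [step for step in range(1, total_steps + 1) if step % interval == 0]

# With the old interval of 200, this 100-step run would never eval or save:
print(periodic_steps(100, 200))  # → []

# With the new interval of 20, the hooks fire five times, matching the
# steps logged in the training-results table (20, 40, 60, 80, 100):
print(periodic_steps(100, 20))  # → [20, 40, 60, 80, 100]
```

With `save_total_limit: 1`, only the most recent of those checkpoints is retained on disk.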