FatCat87 committed 352b468 (verified) · 1 parent: 8d4f1f1

End of training

Files changed (2):
1. README.md +16 -17
2. adapter_model.bin +1 -1
README.md CHANGED
@@ -6,7 +6,7 @@ tags:
 - generated_from_trainer
 base_model: EleutherAI/pythia-1b
 model-index:
-- name: 7d2f4acd-ec41-4940-b17f-5ffb945afa24
+- name: 6a9b3cb4-d557-4213-9a78-090727567bd2
   results: []
 ---
 
@@ -23,15 +23,15 @@ base_model: EleutherAI/pythia-1b
 bf16: auto
 datasets:
 - data_files:
-  - cf3a451ad5f57e3e_train_data.json
+  - dd8e3e233f47c8af_train_data.json
   ds_type: json
   format: custom
-  path: cf3a451ad5f57e3e_train_data.json
+  path: dd8e3e233f47c8af_train_data.json
   type:
     field: null
-    field_input: original_code
-    field_instruction: update_snippet
-    field_output: final_code
+    field_input: null
+    field_instruction: name
+    field_output: text
     field_system: null
     format: null
     no_input_format: null
@@ -51,7 +51,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/7d2f4acd-ec41-4940-b17f-5ffb945afa24
+hub_model_id: FatCat87/6a9b3cb4-d557-4213-9a78-090727567bd2
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -83,9 +83,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 7d2f4acd-ec41-4940-b17f-5ffb945afa24
+wandb_name: 6a9b3cb4-d557-4213-9a78-090727567bd2
 wandb_project: subnet56
-wandb_runid: 7d2f4acd-ec41-4940-b17f-5ffb945afa24
+wandb_runid: 6a9b3cb4-d557-4213-9a78-090727567bd2
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -95,12 +95,12 @@ xformers_attention: null
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/xl5hydog)
-# 7d2f4acd-ec41-4940-b17f-5ffb945afa24
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/49oaeot3)
+# 6a9b3cb4-d557-4213-9a78-090727567bd2
 
 This model is a fine-tuned version of [EleutherAI/pythia-1b](https://huggingface.co/EleutherAI/pythia-1b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0526
+- Loss: 3.3538
 
 ## Model description
 
@@ -130,17 +130,16 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 5
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 4.2604 | 0.0075 | 1 | 0.9479 |
-| 1.8401 | 0.2547 | 34 | 0.1403 |
-| 0.1249 | 0.5094 | 68 | 0.0728 |
-| 0.0707 | 0.7640 | 102 | 0.0526 |
+| 5.1782 | 0.1538 | 1 | 3.4070 |
+| 5.0152 | 0.3077 | 2 | 3.3914 |
+| 4.7583 | 0.6154 | 4 | 3.3633 |
+| 4.7139 | 0.9231 | 6 | 3.3538 |
 
 
 ### Framework versions
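As a quick way to sanity-check the new run's eval trajectory programmatically, here is a minimal Python sketch; the `(epoch, step, validation loss)` tuples are copied from the training-results table in this commit, and the improvement metric is our own illustration, not something the model card reports:

```python
# Validation-loss trajectory of run 6a9b3cb4-d557-4213-9a78-090727567bd2,
# copied from the training-results table: (epoch, step, validation_loss).
rows = [
    (0.1538, 1, 3.4070),
    (0.3077, 2, 3.3914),
    (0.6154, 4, 3.3633),
    (0.9231, 6, 3.3538),
]

first_loss = rows[0][2]
final_loss = rows[-1][2]
# Fractional improvement relative to the first evaluation.
rel_improvement = (first_loss - final_loss) / first_loss
print(f"final eval loss: {final_loss:.4f} "
      f"({rel_improvement:.1%} below the first eval)")
```

With only six optimizer steps over one epoch, the loss moves little after the first eval, which matches the small relative improvement this prints.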
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b8db515f80c8038a6973cc43b2aa9105be8eb4b099165909ef384b9649c3c562
+oid sha256:e82de70e6297e62ce12fa9f04e4eac40494205b285ebd3bba04e6417dcb9a641
 size 67155978
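The adapter_model.bin diff above changes only a Git LFS pointer file, not the binary itself: each line of the pointer is a `key value` pair per the git-lfs v1 pointer format. A minimal sketch of reading those fields (the parser is our own illustration):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer content from this commit.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e82de70e6297e62ce12fa9f04e4eac40494205b285ebd3bba04e6417dcb9a641
size 67155978
"""
info = parse_lfs_pointer(pointer)
print(info["oid"], int(info["size"]))
```

The unchanged `size 67155978` alongside a new `oid` is what you expect here: the adapter weights were retrained (new content hash) but the serialized file has the same byte length.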