cloth_extracion_destilation_model / training_summary_gpt_LinearDiTStudent2_round1_iter4.txt
Model: gpt_LinearDiTStudent2
Optimizer: adamw, LR: 2e-05
Best Val Loss: 1.0569, Test Loss: 1.0260
Epoch 1: Train Loss: 7146.0399, Val Loss: 5.1513
Epoch 2: Train Loss: 2.6326, Val Loss: 1.4842
Epoch 3: Train Loss: 1.4425, Val Loss: 1.2182
Epoch 4: Train Loss: 1.2817, Val Loss: 1.1668
Epoch 5: Train Loss: 1.1631, Val Loss: 1.1296
Epoch 6: Train Loss: 1.0963, Val Loss: 1.0863
Epoch 7: Train Loss: 1.2146, Val Loss: 1.0793
Epoch 8: Train Loss: 1.0522, Val Loss: 1.0756
Epoch 9: Train Loss: 1.0450, Val Loss: 1.0707
Epoch 10: Train Loss: 1.1127, Val Loss: 1.0760
Epoch 11: Train Loss: 1.0296, Val Loss: 1.0569
Epoch 12: Train Loss: 1.0260, Val Loss: 1.0654
Epoch 13: Train Loss: 1.0236, Val Loss: 1.0681
Epoch 14: Train Loss: 1.0223, Val Loss: 1.0661
Epoch 15: Train Loss: 1.0216, Val Loss: 1.0714
Final Test Loss: 1.0260
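
The per-epoch lines above can be cross-checked against the reported best validation loss. The following is a minimal sketch, not part of the original training code: the names SUMMARY, EPOCH_RE, parse_epochs and best_epoch are hypothetical, and the regular expression assumes every epoch line follows the exact "Epoch N: Train Loss: X, Val Loss: Y" layout shown in this summary.

```python
import re

# Per-epoch lines copied verbatim from the summary above.
SUMMARY = """\
Epoch 1: Train Loss: 7146.0399, Val Loss: 5.1513
Epoch 2: Train Loss: 2.6326, Val Loss: 1.4842
Epoch 3: Train Loss: 1.4425, Val Loss: 1.2182
Epoch 4: Train Loss: 1.2817, Val Loss: 1.1668
Epoch 5: Train Loss: 1.1631, Val Loss: 1.1296
Epoch 6: Train Loss: 1.0963, Val Loss: 1.0863
Epoch 7: Train Loss: 1.2146, Val Loss: 1.0793
Epoch 8: Train Loss: 1.0522, Val Loss: 1.0756
Epoch 9: Train Loss: 1.0450, Val Loss: 1.0707
Epoch 10: Train Loss: 1.1127, Val Loss: 1.0760
Epoch 11: Train Loss: 1.0296, Val Loss: 1.0569
Epoch 12: Train Loss: 1.0260, Val Loss: 1.0654
Epoch 13: Train Loss: 1.0236, Val Loss: 1.0681
Epoch 14: Train Loss: 1.0223, Val Loss: 1.0661
Epoch 15: Train Loss: 1.0216, Val Loss: 1.0714
"""

# Assumed line layout: "Epoch N: Train Loss: X, Val Loss: Y".
EPOCH_RE = re.compile(r"Epoch (\d+): Train Loss: ([\d.]+), Val Loss: ([\d.]+)")


def parse_epochs(text):
    """Return a list of (epoch, train_loss, val_loss) tuples."""
    return [(int(e), float(t), float(v)) for e, t, v in EPOCH_RE.findall(text)]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


rows = parse_epochs(SUMMARY)
best = best_epoch(rows)
print(f"Best epoch: {best[0]}, val loss: {best[2]:.4f}")
```

Run on the lines above, the lowest validation loss falls at epoch 11 with a value of 1.0569, which matches the "Best Val Loss" entry in the header.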