Model: gpt_SmallEfficientModel
Optimizer: adamw, LR: 2e-05
Best Val Loss: 3.2904, Test Loss: 3.2941
Epoch 1: Train Loss: 3.9155, Val Loss: 3.6207
Epoch 2: Train Loss: 3.5420, Val Loss: 3.4396
Epoch 3: Train Loss: 3.3867, Val Loss: 3.3419
Epoch 4: Train Loss: 3.3300, Val Loss: 3.3160
Epoch 5: Train Loss: 3.3102, Val Loss: 3.3062
Epoch 6: Train Loss: 3.3017, Val Loss: 3.2987
Epoch 7: Train Loss: 3.2973, Val Loss: 3.2957
Epoch 8: Train Loss: 3.2950, Val Loss: 3.2904
Epoch 9: Train Loss: 3.2936, Val Loss: 3.2943
Epoch 10: Train Loss: 3.2931, Val Loss: 3.2985
Final Test Loss: 3.2941
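
The reported best validation loss can be cross-checked against the per-epoch lines. The following is a minimal sketch, assuming the log format shown above; the names `LOG`, `EPOCH_RE`, `parse_epochs`, and `best_epoch` are hypothetical helpers introduced for illustration and are not part of the training code that produced this log.

```python
import re

# Epoch lines copied from the log above (values unchanged).
LOG = """\
Epoch 1: Train Loss: 3.9155, Val Loss: 3.6207
Epoch 2: Train Loss: 3.5420, Val Loss: 3.4396
Epoch 3: Train Loss: 3.3867, Val Loss: 3.3419
Epoch 4: Train Loss: 3.3300, Val Loss: 3.3160
Epoch 5: Train Loss: 3.3102, Val Loss: 3.3062
Epoch 6: Train Loss: 3.3017, Val Loss: 3.2987
Epoch 7: Train Loss: 3.2973, Val Loss: 3.2957
Epoch 8: Train Loss: 3.2950, Val Loss: 3.2904
Epoch 9: Train Loss: 3.2936, Val Loss: 3.2943
Epoch 10: Train Loss: 3.2931, Val Loss: 3.2985
"""

# Hypothetical pattern matching the "Epoch N: Train Loss: X, Val Loss: Y" layout.
EPOCH_RE = re.compile(r"Epoch (\d+): Train Loss: ([\d.]+), Val Loss: ([\d.]+)")


def parse_epochs(text):
    """Return a list of (epoch, train_loss, val_loss) tuples found in text."""
    return [(int(e), float(t), float(v)) for e, t, v in EPOCH_RE.findall(text)]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


rows = parse_epochs(LOG)
best = best_epoch(rows)
print(f"Best epoch: {best[0]}, val loss: {best[2]:.4f}")
# prints "Best epoch: 8, val loss: 3.2904"
```

The minimum falls at epoch 8, matching the reported Best Val Loss of 3.2904. Validation loss rises in epochs 9 and 10 while training loss keeps decreasing.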