---
tags:
- generated_from_trainer
model-index:
- name: flan-t5-base
  results: []
---

# flan-t5-base

This model was trained from scratch on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3685
- Score: 35.0259
- Counts: [4617, 2627, 1550, 883]
- Totals: [7288, 6288, 5297, 4382]
- Precisions: [63.350713501646545, 41.777989821882954, 29.261846328110252, 20.150616157005935]
- Bp: 0.991
- Sys Len: 7288
- Ref Len: 7354
- Gen Len: 10.556
- Learning Rate: 0.0004

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Score   | Counts                  | Totals                   | Precisions                                                                       | Bp     | Sys Len | Ref Len | Gen Len | Learning Rate |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-----------------------:|:------------------------:|:--------------------------------------------------------------------------------:|:------:|:-------:|:-------:|:-------:|:-------------:|
| 1.6959        | 0.55  | 4000  | 1.5776          | 30.6542 | [4414, 2368, 1345, 733] | [7417, 6417, 5426, 4519] | [59.511932047997846, 36.9019791179679, 24.78805750092149, 16.220402743969906]    | 1.0    | 7417    | 7354    | 10.77   | 0.0045        |
| 1.4378        | 1.11  | 8000  | 1.4527          | 32.3772 | [4526, 2538, 1483, 834] | [7567, 6567, 5576, 4666] | [59.81234306858729, 38.647784376427595, 26.596126255380202, 17.873981997428203]  | 1.0    | 7567    | 7354    | 10.885  | 0.0035        |
| 1.3904        | 1.66  | 12000 | 1.3961          | 33.8978 | [4558, 2559, 1494, 836] | [7286, 6286, 5295, 4383] | [62.55833104584134, 40.70951320394528, 28.21529745042493, 19.073693817020306]    | 0.9907 | 7286    | 7354    | 10.569  | 0.0025        |
| 1.3035        | 2.21  | 16000 | 1.3758          | 34.9471 | [4609, 2628, 1546, 880] | [7297, 6297, 5306, 4392] | [63.16294367548308, 41.73415912339209, 29.136826234451565, 20.036429872495447]   | 0.9922 | 7297    | 7354    | 10.591  | 0.0015        |
| 1.2994        | 2.77  | 20000 | 1.3685          | 35.0259 | [4617, 2627, 1550, 883] | [7288, 6288, 5297, 4382] | [63.350713501646545, 41.777989821882954, 29.261846328110252, 20.150616157005935] | 0.991  | 7288    | 7354    | 10.556  | 0.0004        |

### Framework versions

- Transformers 4.29.2
- Pytorch 2.0.1
- Datasets 2.13.1
- Tokenizers 0.13.3
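
### Checking the reported score

The card does not name the evaluation metric, but the fields Score, Counts, Totals, Precisions, Bp, Sys Len and Ref Len have the shape of a corpus-level BLEU report. Assuming that reading is right, the final Score can be rebuilt from the reported counts and lengths as the brevity penalty times the geometric mean of the four n-gram precisions. The sketch below is an illustration only: `bleu_from_stats` is a hypothetical helper written for this card, not part of any library, and it applies no smoothing.

```python
import math


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Unsmoothed corpus BLEU (0-100 scale) from n-gram statistics.

    counts[i] / totals[i] is the matched / proposed (i+1)-gram count.
    Assumes every count is non-zero, which holds for the values in this card.
    """
    precisions = [c / t for c, t in zip(counts, totals)]
    # Brevity penalty: 1.0 when the system output is at least as long as the
    # reference, otherwise exp(1 - ref_len / sys_len).
    bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)
    # Geometric mean of the precisions, computed in log space.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    score = 100.0 * bp * math.exp(log_mean)
    return score, bp, [100.0 * p for p in precisions]


# Final evaluation row of this card (step 20000).
score, bp, precisions = bleu_from_stats(
    counts=[4617, 2627, 1550, 883],
    totals=[7288, 6288, 5297, 4382],
    sys_len=7288,
    ref_len=7354,
)
print(f"score={score:.4f} bp={bp:.4f}")
```

If the BLEU assumption holds, the printed score and brevity penalty should land very close to the reported 35.0259 and 0.991.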
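
### Learning rate schedule

The hyperparameters list a linear scheduler with a peak `learning_rate` of 0.005 and a warmup ratio of 0.1, and the results table logs a decaying learning rate at each evaluation step. The following sketch shows how a linear warmup followed by linear decay would produce values of that shape. The total step count is not stated in the card; `TOTAL_STEPS = 21660` is an assumption, estimated from step 20000 falling at epoch 2.77 of 3, and `linear_schedule_lr` is a hypothetical helper rather than the Trainer's own code.

```python
import math

PEAK_LR = 0.005       # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 21660   # assumption: about 20000 / 2.77 * 3, not stated in the card


def linear_schedule_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS,
                       warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0.0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Compare against the learning rates logged in the results table.
logged = {4000: 0.0045, 8000: 0.0035, 12000: 0.0025, 16000: 0.0015, 20000: 0.0004}
for step, reported in logged.items():
    print(step, f"{linear_schedule_lr(step):.6f}", reported)
```

Under these assumptions the computed values track the logged column to within rounding. A different total step count would shift them slightly, so treat the match as a consistency check, not as a reconstruction of the actual run.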