---
tags:
- generated_from_trainer
model-index:
- name: flan-t5-base
  results: []
---

# flan-t5-base

The Trainer recorded neither a base checkpoint nor a dataset name, so this card can only describe the model as trained from scratch on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3685
- Score: 35.0259
- Counts: [4617, 2627, 1550, 883]
- Totals: [7288, 6288, 5297, 4382]
- Precisions: [63.350713501646545, 41.777989821882954, 29.261846328110252, 20.150616157005935]
- Bp: 0.991
- Sys Len: 7288
- Ref Len: 7354
- Gen Len: 10.556
- Learning Rate: 0.0004 (the scheduler's value at the final evaluation step, not an evaluation metric)
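
The field names above (Score, Counts, Totals, Precisions, Bp, Sys Len, Ref Len) match the statistics that BLEU implementations such as sacreBLEU report, although the card does not name the metric. Assuming corpus-level BLEU with four n-gram orders and the standard brevity penalty, the following sketch recomputes the score from the counts and totals listed above. The function name `bleu_from_stats` is hypothetical and only illustrates the arithmetic.

```python
import math

# Final-checkpoint statistics copied from the list above.
counts = [4617, 2627, 1550, 883]    # matched n-grams for n = 1..4
totals = [7288, 6288, 5297, 4382]   # n-grams in the system output for n = 1..4
sys_len, ref_len = 7288, 7354       # system and reference lengths in tokens


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Recompute BLEU from n-gram statistics (assumes every count is non-zero)."""
    precisions = [c / t for c, t in zip(counts, totals)]
    # Brevity penalty: only applied when the output is shorter than the reference.
    bp = 1.0 if sys_len >= ref_len else math.exp(1 - ref_len / sys_len)
    # Geometric mean of the n-gram precisions.
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    return 100 * bp * geo_mean, bp, [100 * p for p in precisions]


score, bp, precisions = bleu_from_stats(counts, totals, sys_len, ref_len)
print(f"BLEU={score:.4f}  BP={bp:.3f}")
```

Under these assumptions the recomputed score and brevity penalty agree with the reported Score and Bp to the printed precision, which supports reading Score as a BLEU score on a 0 to 100 scale.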

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
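
The `lr_scheduler_type: linear` and `lr_scheduler_warmup_ratio: 0.1` settings describe a learning rate that rises linearly to the peak of 0.005 over the first 10% of training and then decays linearly to zero. The sketch below illustrates that shape in plain Python. The total of 21,690 optimizer steps is an assumption, estimated from the epoch and step columns in the training results (about 7,230 steps per epoch over 3 epochs); it is not a recorded value, and `linear_schedule_lr` is a hypothetical helper rather than the Trainer's own code.

```python
import math


def linear_schedule_lr(step, peak_lr=0.005, total_steps=21_690, warmup_ratio=0.1):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    total_steps is an estimate (assumption), not a value recorded in this card.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (4000, 8000, 12000, 16000, 20000):
    print(step, round(linear_schedule_lr(step), 4))
```

With the assumed step count, the values at the five evaluation steps round to 0.0045, 0.0035, 0.0025, 0.0015, and 0.0004, matching the last column of the training results table. This suggests that column logs the scheduler's learning rate at each evaluation.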

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Score   | Counts                  | Totals                   | Precisions                                                                       | Bp     | Sys Len | Ref Len | Gen Len | Learning Rate |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-----------------------:|:------------------------:|:--------------------------------------------------------------------------------:|:------:|:-------:|:-------:|:-------:|:-------------:|
| 1.6959        | 0.55  | 4000  | 1.5776          | 30.6542 | [4414, 2368, 1345, 733] | [7417, 6417, 5426, 4519] | [59.511932047997846, 36.9019791179679, 24.78805750092149, 16.220402743969906]    | 1.0    | 7417    | 7354    | 10.77   | 0.0045        |
| 1.4378        | 1.11  | 8000  | 1.4527          | 32.3772 | [4526, 2538, 1483, 834] | [7567, 6567, 5576, 4666] | [59.81234306858729, 38.647784376427595, 26.596126255380202, 17.873981997428203]  | 1.0    | 7567    | 7354    | 10.885  | 0.0035        |
| 1.3904        | 1.66  | 12000 | 1.3961          | 33.8978 | [4558, 2559, 1494, 836] | [7286, 6286, 5295, 4383] | [62.55833104584134, 40.70951320394528, 28.21529745042493, 19.073693817020306]    | 0.9907 | 7286    | 7354    | 10.569  | 0.0025        |
| 1.3035        | 2.21  | 16000 | 1.3758          | 34.9471 | [4609, 2628, 1546, 880] | [7297, 6297, 5306, 4392] | [63.16294367548308, 41.73415912339209, 29.136826234451565, 20.036429872495447]   | 0.9922 | 7297    | 7354    | 10.591  | 0.0015        |
| 1.2994        | 2.77  | 20000 | 1.3685          | 35.0259 | [4617, 2627, 1550, 883] | [7288, 6288, 5297, 4382] | [63.350713501646545, 41.777989821882954, 29.261846328110252, 20.150616157005935] | 0.991  | 7288    | 7354    | 10.556  | 0.0004        |


### Framework versions

- Transformers 4.29.2
- PyTorch 2.0.1
- Datasets 2.13.1
- Tokenizers 0.13.3