# mt5-large-gramatika1500k
This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) (the fine-tuning dataset is not specified in this card). It achieves the following results on the evaluation set:
- Loss: 0.0471
- Rouge1: 75.56
- Rouge2: 72.1272
- Rougel: 75.5131
- Rougelsum: 75.5134
- Gen Len: 18.4427
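
For a quick sanity check, the checkpoint can be loaded with the standard `transformers` sequence-to-sequence API. This is a minimal sketch: the input sentence and generation settings below are illustrative assumptions, since the expected input format is not documented in this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned checkpoint from the Hub.
model_id = "jeremyvictor/mt5-large-gramatika1500k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative input; adjust to the task the model was trained for.
inputs = tokenizer("Your input sentence here.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```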
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 10
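
As a point of reference, these hyperparameters map onto `Seq2SeqTrainingArguments` roughly as sketched below. The actual training script is not provided, so `output_dir`, `predict_with_generate`, and everything not listed above are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameters listed above; all other
# arguments (output_dir, logging, dataset wiring) are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-large-gramatika1500k",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",           # Adafactor optimizer
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,  # needed to compute ROUGE during evaluation
)
```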
### Training results
| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.1109        | 1.33  | 100000 | 0.0567          | 75.0228 | 71.1923 | 74.9653 | 74.9652   | 18.4441 |
| 0.0572        | 2.67  | 200000 | 0.0494          | 75.3858 | 71.8285 | 75.3356 | 75.334    | 18.4427 |
| 0.0431        | 4.0   | 300000 | 0.0471          | 75.56   | 72.1272 | 75.5131 | 75.5134   | 18.4427 |
| 0.0332        | 5.33  | 400000 | 0.0486          | 75.6167 | 72.2424 | 75.5734 | 75.5726   | 18.4424 |
| 0.0277        | 6.67  | 500000 | 0.0490          | 75.6749 | 72.3462 | 75.6327 | 75.6317   | 18.4428 |
| 0.0236        | 8.0   | 600000 | 0.0501          | 75.6924 | 72.3891 | 75.6502 | 75.6508   | 18.4430 |
| 0.0202        | 9.34  | 700000 | 0.0525          | 75.7134 | 72.4174 | 75.6724 | 75.6714   | 18.4427 |
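
The ROUGE columns correspond to the standard rouge1/rouge2/rougeL/rougeLsum metrics. A minimal sketch of computing them with the `evaluate` library (the predictions and references below are placeholders, not data from this model):

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder strings for illustration only.
predictions = ["the model's generated sentence"]
references = ["the reference sentence"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```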
### Framework versions
- Transformers 4.31.0
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3