# mT5-Small (Eur-Lex-Sum Maltese)
This model is a fine-tuned version of google/mt5-small on the Maltese configuration of the dennlinger/eur-lex-sum dataset. It achieves the following results on the test set:
- Loss: 1.4531
- ChrF:
  - Score: 51.5481
  - Char Order: 6
  - Word Order: 0
  - Beta: 2
- ROUGE:
  - ROUGE-1: 0.5176
  - ROUGE-2: 0.3497
  - ROUGE-L: 0.4249
  - ROUGE-Lsum: 0.4247
- Gen Len: 254.8511
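The evaluation script itself is not part of this card. As a point of reference, metrics in this exact shape (ChrF with score/char order/word order/beta; ROUGE as F-measures in [0, 1]) can be produced with the Hugging Face `evaluate` library; the snippet below is a minimal sketch with placeholder predictions and references, not necessarily the script used here:

```python
# Sketch of computing the metric dictionaries reported above with the
# Hugging Face `evaluate` library (not necessarily the exact script used here).
import evaluate

chrf = evaluate.load("chrf")
rouge = evaluate.load("rouge")

# Placeholders: real usage would pass decoded model outputs and gold test summaries.
predictions = ["generated summary ..."]
references = ["reference summary ..."]

# ChrF returns `score`, `char_order`, `word_order`, and `beta`,
# matching the ChrF fields listed above (char_order=6, word_order=0, beta=2).
print(chrf.compute(predictions=predictions, references=[[r] for r in references]))

# ROUGE returns `rouge1`, `rouge2`, `rougeL`, and `rougeLsum` as F-measures in [0, 1].
print(rouge.compute(predictions=predictions, references=references))
```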
## Intended uses & limitations
The model is fine-tuned for a specific task (summarisation of legal documents in Maltese) and should only be used for the same or a similar task. Any limitations present in the base model are inherited.
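No usage snippet ships with this card; the following is a minimal inference sketch using the standard Transformers seq2seq API. The input truncation length, beam count, and the 256-token generation cap (chosen to roughly match the reported Gen Len of ~255) are assumptions, not settings confirmed by the training script:

```python
# Minimal inference sketch. Assumptions: input truncation at 1024 tokens,
# beam search with 4 beams, and a 256-token generation cap chosen to roughly
# match the reported Gen Len of ~255.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "MLRS/mt5-small_eurlexsum-mlt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # a Maltese legal document to summarise

inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, num_beams=4, max_length=256)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```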
## Training procedure
The model was fine-tuned using a customised script.
### Training hyperparameters
The following hyperparameters were used during training (an equivalent configuration sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adafactor (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 200.0
- early_stopping_patience: 20
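Since the customised script is not published, the list above can be read as (approximately) the following `Seq2SeqTrainingArguments` configuration, using the Transformers 4.48 API. Anything not in the list, such as the output directory, the per-epoch evaluation/save strategy, and the best-model metric, is an assumption made to be consistent with the per-epoch results table below:

```python
# Sketch of a Trainer configuration matching the hyperparameters listed above.
# The actual customised script may differ in detail.
from transformers import Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small_eurlexsum-mlt",  # assumed name
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",         # the results table reports one row per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,   # needed for early stopping to restore the best checkpoint
    metric_for_best_model="loss",  # assumed
    predict_with_generate=True,
)

# Early stopping with the listed patience of 20 evaluations.
early_stopping = EarlyStoppingCallback(early_stopping_patience=20)
```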
### Training results
Training Loss | Epoch | Step | Validation Loss | ChrF Score | ChrF Char Order | ChrF Word Order | ChrF Beta | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 30 | 2.2500 | 18.4506 | 6 | 0 | 2 | 0.1901 | 0.0868 | 0.1728 | 0.1729 | 255.0 |
No log | 2.0 | 60 | 1.9908 | 40.0789 | 6 | 0 | 2 | 0.3872 | 0.2330 | 0.3379 | 0.3376 | 255.0 |
No log | 3.0 | 90 | 1.7490 | 44.1723 | 6 | 0 | 2 | 0.4406 | 0.2759 | 0.3760 | 0.3758 | 255.0 |
No log | 4.0 | 120 | 1.7205 | 49.4429 | 6 | 0 | 2 | 0.4885 | 0.3313 | 0.4081 | 0.4079 | 255.0 |
No log | 5.0 | 150 | 1.5647 | 46.3055 | 6 | 0 | 2 | 0.4626 | 0.3068 | 0.3886 | 0.3886 | 255.0 |
No log | 6.0 | 180 | 1.5374 | 46.3856 | 6 | 0 | 2 | 0.4756 | 0.3169 | 0.3986 | 0.3989 | 254.4439 |
No log | 7.0 | 210 | 1.5262 | 47.2806 | 6 | 0 | 2 | 0.4706 | 0.3154 | 0.3959 | 0.3962 | 254.7807 |
No log | 8.0 | 240 | 1.5142 | 48.5214 | 6 | 0 | 2 | 0.4916 | 0.3255 | 0.4121 | 0.4119 | 254.8449 |
No log | 9.0 | 270 | 1.5271 | 49.4788 | 6 | 0 | 2 | 0.4982 | 0.3350 | 0.4211 | 0.4210 | 253.9893 |
No log | 10.0 | 300 | 1.4995 | 48.3063 | 6 | 0 | 2 | 0.4832 | 0.3224 | 0.4127 | 0.4126 | 254.6684 |
No log | 11.0 | 330 | 1.4947 | 52.1382 | 6 | 0 | 2 | 0.5213 | 0.3593 | 0.4416 | 0.4418 | 254.7914 |
No log | 12.0 | 360 | 1.4704 | 49.9226 | 6 | 0 | 2 | 0.5004 | 0.3363 | 0.4236 | 0.4235 | 254.6203 |
No log | 13.0 | 390 | 1.4933 | 51.6030 | 6 | 0 | 2 | 0.5199 | 0.3514 | 0.4317 | 0.4318 | 253.6257 |
No log | 14.0 | 420 | 1.4640 | 47.8714 | 6 | 0 | 2 | 0.4840 | 0.3242 | 0.4094 | 0.4091 | 254.6952 |
No log | 15.0 | 450 | 1.4726 | 51.2718 | 6 | 0 | 2 | 0.5188 | 0.3488 | 0.4354 | 0.4356 | 254.7166 |
No log | 16.0 | 480 | 1.4667 | 49.9968 | 6 | 0 | 2 | 0.4989 | 0.3400 | 0.4287 | 0.4281 | 254.6203 |
1.7931 | 17.0 | 510 | 1.4624 | 50.7874 | 6 | 0 | 2 | 0.5123 | 0.3436 | 0.4345 | 0.4345 | 254.5508 |
1.7931 | 18.0 | 540 | 1.4775 | 50.5126 | 6 | 0 | 2 | 0.5121 | 0.3448 | 0.4273 | 0.4274 | 253.4439 |
1.7931 | 19.0 | 570 | 1.4762 | 50.7875 | 6 | 0 | 2 | 0.5194 | 0.3458 | 0.4311 | 0.4315 | 252.6631 |
1.7931 | 20.0 | 600 | 1.5157 | 52.2624 | 6 | 0 | 2 | 0.5187 | 0.3446 | 0.4324 | 0.4323 | 253.8289 |
1.7931 | 21.0 | 630 | 1.4982 | 51.8279 | 6 | 0 | 2 | 0.5161 | 0.3478 | 0.4368 | 0.4369 | 254.3529 |
1.7931 | 22.0 | 660 | 1.5087 | 51.9486 | 6 | 0 | 2 | 0.5174 | 0.3438 | 0.4315 | 0.4310 | 254.7807 |
1.7931 | 23.0 | 690 | 1.5355 | 51.9191 | 6 | 0 | 2 | 0.5224 | 0.3500 | 0.4301 | 0.4298 | 254.4439 |
1.7931 | 24.0 | 720 | 1.5061 | 50.0702 | 6 | 0 | 2 | 0.5002 | 0.3307 | 0.4152 | 0.4153 | 254.1765 |
1.7931 | 25.0 | 750 | 1.5271 | 50.3567 | 6 | 0 | 2 | 0.5046 | 0.3349 | 0.4216 | 0.4222 | 253.3102 |
1.7931 | 26.0 | 780 | 1.5378 | 50.8240 | 6 | 0 | 2 | 0.5089 | 0.3401 | 0.4210 | 0.4202 | 253.6471 |
1.7931 | 27.0 | 810 | 1.5414 | 50.8294 | 6 | 0 | 2 | 0.5118 | 0.3447 | 0.4282 | 0.4280 | 254.1176 |
1.7931 | 28.0 | 840 | 1.5774 | 52.6591 | 6 | 0 | 2 | 0.5283 | 0.3537 | 0.4390 | 0.4387 | 253.6684 |
1.7931 | 29.0 | 870 | 1.5661 | 52.3420 | 6 | 0 | 2 | 0.5292 | 0.3525 | 0.4376 | 0.4376 | 253.3262 |
1.7931 | 30.0 | 900 | 1.6079 | 51.8227 | 6 | 0 | 2 | 0.5212 | 0.3448 | 0.4313 | 0.4315 | 253.9626 |
1.7931 | 31.0 | 930 | 1.5900 | 51.9129 | 6 | 0 | 2 | 0.5245 | 0.3479 | 0.4327 | 0.4327 | 253.7380 |
### Framework versions
- Transformers 4.48.2
- Pytorch 2.4.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
## License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.
## Citation
This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:

    @inproceedings{micallef-borg-2025-melabenchv1,
        title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
        author = "Micallef, Kurt  and
          Borg, Claudia",
        editor = "Che, Wanxiang  and
          Nabende, Joyce  and
          Shutova, Ekaterina  and
          Pilehvar, Mohammad Taher",
        booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
        month = jul,
        year = "2025",
        address = "Vienna, Austria",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2025.findings-acl.1053/",
        doi = "10.18653/v1/2025.findings-acl.1053",
        pages = "20505--20527",
        ISBN = "979-8-89176-256-5",
    }
## Evaluation results
MELABench Leaderboard results on dennlinger/eur-lex-sum (Maltese):
- ChrF: 52.140
- ROUGE-L: 0.440