mT5-Small (MultiEURLEX Maltese)
This model is a fine-tuned version of google/mt5-small on the Maltese (mt) configuration of the nlpaueb/multi_eurlex dataset. It achieves the following results on the test set:
- Loss: 0.3648
- F1: 0.3125
Intended uses & limitations
The model is fine-tuned for a specific task and should be used for the same or a similar task (see the usage sketch below). Any limitations present in the base model are inherited.
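The following is a minimal inference sketch, assuming the standard text-to-text setup in which the fine-tuned mT5 generates label text for a Maltese input document. The exact target format produced by the fine-tuning script (for example, how multiple EUROVOC-style labels are joined) is not documented here, so the decoding step is illustrative.

```python
# Minimal inference sketch (assumption: the model generates label text for a
# Maltese legal document; the exact target format used during fine-tuning is
# not documented in this card).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "MLRS/mt5-small_multieurlex-mlt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # a Maltese legal document, e.g. from the mt configuration of nlpaueb/multi_eurlex

inputs = tokenizer(document, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```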
Training procedure
The model was fine-tuned using a customised script.
Training hyperparameters
The following hyperparameters were used during training (a sketch of how they might map onto the Hugging Face Trainer API follows the list):
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adafactor (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 200.0
- early_stopping_patience: 20
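As a rough guide, the sketch below expresses these settings with the Hugging Face Trainer API. The original customised script is not published in this card, so the output directory name, the per-epoch evaluation and saving strategy (inferred from the per-epoch results table below), and the metric name "f1" are assumptions rather than the exact configuration.

```python
# Hedged sketch of the listed hyperparameters as Seq2SeqTrainingArguments;
# values not listed above (output_dir, eval/save strategy, metric name) are assumptions.
from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="mt5-small_multieurlex-mlt",  # assumed name
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adafactor",                 # Adafactor with no additional optimizer arguments
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",             # evaluation once per epoch, matching the results table
    save_strategy="epoch",
    load_best_model_at_end=True,       # needed so early stopping restores the best checkpoint
    metric_for_best_model="f1",        # assumed; early stopping could also track eval loss
    predict_with_generate=True,        # decode during evaluation so F1 can be computed from text
)

# Early stopping with a patience of 20 evaluation rounds, passed to the trainer's callbacks.
early_stopping = EarlyStoppingCallback(early_stopping_patience=20)
```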
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
1.5559 | 1.0 | 548 | 0.4136 | 0.2994 |
0.424 | 2.0 | 1096 | 0.3933 | 0.2995 |
0.4078 | 3.0 | 1644 | 0.3755 | 0.3007 |
0.3848 | 4.0 | 2192 | 0.3663 | 0.2990 |
0.3714 | 5.0 | 2740 | 0.3571 | 0.2987 |
0.3599 | 6.0 | 3288 | 0.3452 | 0.3010 |
0.3436 | 7.0 | 3836 | 0.3237 | 0.3010 |
0.3358 | 8.0 | 4384 | 0.3232 | 0.3009 |
0.3292 | 9.0 | 4932 | 0.3145 | 0.2989 |
0.3196 | 10.0 | 5480 | 0.3101 | 0.2983 |
0.3045 | 11.0 | 6028 | 0.3111 | 0.2985 |
0.301 | 12.0 | 6576 | 0.3009 | 0.2941 |
0.3017 | 13.0 | 7124 | 0.3081 | 0.2911 |
0.3008 | 14.0 | 7672 | 0.3077 | 0.2952 |
0.2945 | 15.0 | 8220 | 0.3013 | 0.2982 |
0.2933 | 16.0 | 8768 | 0.2941 | 0.2940 |
0.2858 | 17.0 | 9316 | 0.3019 | 0.2918 |
0.2849 | 18.0 | 9864 | 0.2933 | 0.2965 |
0.2804 | 19.0 | 10412 | 0.2937 | 0.2918 |
0.2814 | 20.0 | 10960 | 0.2969 | 0.2960 |
0.2735 | 21.0 | 11508 | 0.2983 | 0.2925 |
0.2735 | 22.0 | 12056 | 0.3021 | 0.2986 |
0.2713 | 23.0 | 12604 | 0.2953 | 0.2956 |
0.2704 | 24.0 | 13152 | 0.3007 | 0.2959 |
0.2634 | 25.0 | 13700 | 0.3044 | 0.2986 |
0.2678 | 26.0 | 14248 | 0.2996 | 0.3005 |
0.2611 | 27.0 | 14796 | 0.2942 | 0.2961 |
Framework versions
- Transformers 4.51.1
- PyTorch 2.7.0+cu126
- Datasets 3.2.0
- Tokenizers 0.21.1
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.
Citation
This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:
@inproceedings{micallef-borg-2025-melabenchv1,
title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
author = "Micallef, Kurt and
Borg, Claudia",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-acl.1053/",
doi = "10.18653/v1/2025.findings-acl.1053",
pages = "20505--20527",
ISBN = "979-8-89176-256-5",
}
Evaluation results
- Macro-averaged F1 on nlpaueb/multi_eurlex (MELABench Leaderboard): 30.100