# BERTu (Maltese Sentiment Analysis)

This model is a fine-tuned version of MLRS/BERTu for Maltese Sentiment Analysis. It achieves the following results on the test set (the reported F1 is macro-averaged; see the metric sketch below):
- Loss: 0.5176
- F1: 0.8511
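A minimal sketch of how the macro-averaged F1 could be computed with scikit-learn; the labels and predictions below are illustrative placeholders, not outputs of this model:

```python
# Minimal sketch: macro-averaged F1, as reported above.
# The labels and predictions are illustrative placeholders only.
from sklearn.metrics import f1_score

y_true = ["positive", "negative", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "positive", "positive", "negative"]

# average="macro" weights every class equally, regardless of support.
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(f"Macro F1: {macro_f1:.4f}")
```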
## Intended uses & limitations
The model is fine-tuned on a specific task, so it should only be used for the same or a similar task. Any limitations present in the base model are inherited.
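A minimal usage sketch with the Transformers `pipeline` API follows; the example sentence is illustrative, and the returned label names depend on the label mapping stored in this model's configuration:

```python
# Minimal sketch: loading the fine-tuned checkpoint for sentiment classification.
# The printed label names depend on the model's config (id2label);
# they are illustrative, not guaranteed outputs.
from transformers import pipeline

classifier = pipeline("text-classification", model="MLRS/BERTu_sentiment-mlt")

# Example Maltese input sentence (illustrative).
result = classifier("Il-film kien sabiħ ħafna!")
print(result)  # e.g. [{'label': '...', 'score': 0.99}]
```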
## Training procedure
The model was fine-tuned using a customised script.
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 2
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: inverse_sqrt
- lr_scheduler_warmup_ratio: 0.005
- num_epochs: 200.0
- early_stopping_patience: 20
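A rough configuration sketch of these hyperparameters using the Transformers `Trainer` API is shown below; it assumes the customised script builds on `Trainer`, and the model, dataset, and metric wiring are omitted:

```python
# Sketch of TrainingArguments mirroring the hyperparameters above.
# Assumes the customised fine-tuning script builds on the 🤗 Trainer API;
# model/dataset/metric setup is omitted.
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="bertu-sentiment-mlt",   # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=2,
    optim="adamw_torch",                # AdamW with default betas/epsilon
    lr_scheduler_type="inverse_sqrt",
    warmup_ratio=0.005,
    num_train_epochs=200,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # required for early stopping
    metric_for_best_model="f1",         # assumes the eval metric is logged as "f1"
)

# Stop training after 20 evaluations without improvement.
early_stopping = EarlyStoppingCallback(early_stopping_patience=20)
```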
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1     |
|:--------------|:------|:-----|:----------------|:-------|
| No log        | 1.0   | 38   | 0.4389          | 0.7914 |
| No log        | 2.0   | 76   | 0.2928          | 0.9020 |
| No log        | 3.0   | 114  | 0.2375          | 0.8766 |
| No log        | 4.0   | 152  | 0.2501          | 0.9076 |
| No log        | 5.0   | 190  | 0.2855          | 0.9215 |
| No log        | 6.0   | 228  | 0.3583          | 0.8970 |
| No log        | 7.0   | 266  | 0.4191          | 0.8731 |
| No log        | 8.0   | 304  | 0.4540          | 0.8865 |
| No log        | 9.0   | 342  | 0.4227          | 0.8970 |
| No log        | 10.0  | 380  | 0.4526          | 0.8970 |
| No log        | 11.0  | 418  | 0.4572          | 0.8970 |
| No log        | 12.0  | 456  | 0.4483          | 0.8970 |
| No log        | 13.0  | 494  | 0.4574          | 0.8970 |
| 0.1024        | 14.0  | 532  | 0.4587          | 0.8970 |
| 0.1024        | 15.0  | 570  | 0.4676          | 0.8970 |
| 0.1024        | 16.0  | 608  | 0.4732          | 0.8970 |
| 0.1024        | 17.0  | 646  | 0.4772          | 0.8970 |
| 0.1024        | 18.0  | 684  | 0.4897          | 0.8849 |
| 0.1024        | 19.0  | 722  | 0.4938          | 0.8849 |
| 0.1024        | 20.0  | 760  | 0.4950          | 0.8849 |
| 0.1024        | 21.0  | 798  | 0.4947          | 0.8970 |
| 0.1024        | 22.0  | 836  | 0.4963          | 0.8970 |
| 0.1024        | 23.0  | 874  | 0.4993          | 0.8970 |
| 0.1024        | 24.0  | 912  | 0.5010          | 0.8970 |
| 0.1024        | 25.0  | 950  | 0.5030          | 0.8970 |
### Framework versions
- Transformers 4.51.1
- PyTorch 2.7.0+cu126
- Datasets 3.2.0
- Tokenizers 0.21.1
## License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.
## Citation
This work was first presented in *MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP*. Cite it as follows:
```bibtex
@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt and
      Borg, Claudia",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}
```