---
license: mit
base_model: coppercitylabs/uzbert-base-uncased
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: uzpostagger-cyrillic-3
  results: []
---

# uzpostagger-cyrillic-3

This model is a fine-tuned version of [coppercitylabs/uzbert-base-uncased](https://huggingface.co/coppercitylabs/uzbert-base-uncased) on the [uzbekpos](https://huggingface.co/datasets/latofat/uzbekpos) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2715
- Precision: 0.8763
- Recall: 0.8699
- F1: 0.8731
- Accuracy: 0.9219

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 25   | 0.8765          | 0.6558    | 0.5477 | 0.5969 | 0.7485   |
| No log        | 2.0   | 50   | 0.4086          | 0.8496    | 0.8237 | 0.8364 | 0.9004   |
| No log        | 3.0   | 75   | 0.3133          | 0.8615    | 0.8552 | 0.8583 | 0.9142   |
| No log        | 4.0   | 100  | 0.2806          | 0.8730    | 0.8657 | 0.8693 | 0.9193   |
| No log        | 5.0   | 125  | 0.2715          | 0.8763    | 0.8699 | 0.8731 | 0.9219   |

### Framework versions

- Transformers 4.32.1
- Pytorch 2.2.0
- Datasets 2.17.1
- Tokenizers 0.13.3

## Citation Information

```
@inproceedings{bobojonova-etal-2025-bbpos,
    title = "{BBPOS}: {BERT}-based Part-of-Speech Tagging for {U}zbek",
    author = "Bobojonova, Latofat  and
      Akhundjanova, Arofat  and
      Ostheimer, Phil Sidney  and
      Fellenz, Sophie",
    editor = "Hettiarachchi, Hansi  and
      Ranasinghe, Tharindu  and
      Rayson, Paul  and
      Mitkov, Ruslan  and
      Gaber, Mohamed  and
      Premasiri, Damith  and
      Tan, Fiona Anting  and
      Uyangodage, Lasitha",
    booktitle = "Proceedings of the First Workshop on Language Models for Low-Resource Languages",
    month = jan,
    year = "2025",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.loreslm-1.23/",
    pages = "287--293",
    abstract = "This paper advances NLP research for the low-resource Uzbek language by evaluating two previously untested monolingual Uzbek BERT models on the part-of-speech (POS) tagging task and introducing the first publicly available UPOS-tagged benchmark dataset for Uzbek. Our fine-tuned models achieve 91{\%} average accuracy, outperforming the baseline multi-lingual BERT as well as the rule-based tagger. Notably, these models capture intermediate POS changes through affixes and demonstrate context sensitivity, unlike existing rule-based taggers."
}
```
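
As a sanity check on the training results reported in this card, the F1 column can be recomputed from the precision and recall columns as their harmonic mean, F1 = 2PR / (P + R). The sketch below does that for each epoch and also derives a rough upper bound on the number of training examples from the step count (25 steps per epoch at `train_batch_size: 16`). The helper names `f1_score` and `max_examples_per_epoch` are illustrative and not part of any library. The size bound assumes a single device and no gradient accumulation, which the card does not state.

```python
# Rows copied from the "Training results" table:
# (epoch, precision, recall, reported F1).
ROWS = [
    (1, 0.6558, 0.5477, 0.5969),
    (2, 0.8496, 0.8237, 0.8364),
    (3, 0.8615, 0.8552, 0.8583),
    (4, 0.8730, 0.8657, 0.8693),
    (5, 0.8763, 0.8699, 0.8731),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


def max_examples_per_epoch(steps_per_epoch: int, batch_size: int) -> int:
    """Upper bound on the examples seen in one epoch.

    Assumption (not stated in the card): one device, no gradient
    accumulation, so each optimizer step consumes one batch.
    """
    return steps_per_epoch * batch_size


if __name__ == "__main__":
    for epoch, p, r, reported in ROWS:
        recomputed = f1_score(p, r)
        print(f"epoch {epoch}: reported F1={reported:.4f}, "
              f"recomputed F1={recomputed:.4f}")
    # 25 steps per epoch, batch size 16, both taken from the card.
    print("max training examples per epoch:", max_examples_per_epoch(25, 16))
```

Because the precision and recall values in the table are themselves rounded to four decimals, the recomputed F1 is only expected to agree with the reported value to within about 1e-4.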