classla/xlm-r-slobertic

5roop committed on Nov 9, 2023

Commit 05139f2 · 1 Parent(s): eb087ab

Update README.md

Files changed (1)

README.md +39 -0

@@ -1,3 +1,42 @@
 ---
 license: cc-by-sa-4.0
+language:
+- hr
+- sl
+- bs
+- sr
 ---
+# XLM-R-SloBertić
+This model was produced by pre-training [XLM-Roberta-large](https://huggingface.co/xlm-roberta-large) for 48k steps on South Slavic languages.
+# Benchmarking
+Three tasks were chosen for model evaluation:
+* Named Entity Recognition (NER)
+* Sentiment regression
+* COPA (Choice of Plausible Alternatives)
+In all cases, the model was fine-tuned on the specific downstream task.
+## NER
+(entry to be added soon)
+## Sentiment regression
+The [ParlaSent dataset](https://huggingface.co/datasets/classla/ParlaSent) was used to evaluate sentiment regression for Bosnian, Croatian, and Serbian.
+The procedure is explained in greater detail in the dedicated [benchmarking repository](https://github.com/clarinsi/benchich/tree/main/sentiment).
+| system                                                                 | train               | test                     |   r^2 |
+|:-----------------------------------------------------------------------|:--------------------|:-------------------------|------:|
+| [xlm-r-parlasent](https://huggingface.co/classla/xlm-r-parlasent)      | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.615 |
+| [BERTić](https://huggingface.co/classla/bcms-bertic)                   | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.612 |
+| **XLM-R-SloBERTić**                                                    | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.607 |
+| XLM-Roberta-Large                                                      | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.605 |
+| XLM-R-BERTić                                                           | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.601 |
+| [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.537 |
+| XLM-Roberta-Base                                                       | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.500 |
+| dummy (mean)                                                           | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | -0.12 |
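
The r^2 column and the negative score of the `dummy (mean)` row can be illustrated with a minimal sketch. It assumes r^2 is the usual coefficient of determination and that the dummy baseline predicts the mean of the training labels for every test instance; neither detail is confirmed here, and the labels below are made up for illustration, not taken from ParlaSent.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of predicting the mean of y_true itself.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical labels (NOT ParlaSent data); the two splits have different means.
train_labels = [1.0, 2.0, 3.0]
test_labels = [2.0, 3.0, 4.0, 5.0]

# Assumed dummy baseline: always predict the mean of the training labels.
train_mean = sum(train_labels) / len(train_labels)
dummy_predictions = [train_mean] * len(test_labels)

dummy_r2 = r_squared(test_labels, dummy_predictions)
print(f"dummy r^2: {dummy_r2:.3f}")
```

Under these assumptions, a constant prediction scores exactly 0 only when it equals the mean of the test labels. Any mismatch between the training and test means pushes the score below zero, which is one plausible reading of the negative value reported for the dummy row.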
+## COPA
+(to be added soon)
+# Citation
+(to be added soon)
+# Authors
+* [Nikola Ljubešić](https://huggingface.co/nljubesi)