Fill-Mask
Transformers
PyTorch
xlm-roberta
Inference Endpoints
5roop committed
Commit 05139f2 · 1 Parent(s): eb087ab

Update README.md

Files changed (1)
  1. README.md +39 -0
README.md CHANGED
@@ -1,3 +1,42 @@
  ---
  license: cc-by-sa-4.0
+ language:
+ - hr
+ - sl
+ - bs
+ - sr
  ---
+ # XLM-R-SloBERTić
+
+ This model was produced by further pre-training [XLM-Roberta-large](https://huggingface.co/xlm-roberta-large) for 48k steps on South Slavic languages.
+
+ # Benchmarking
+ Three tasks were chosen for model evaluation:
+ * Named Entity Recognition (NER)
+ * Sentiment regression
+ * COPA (Choice of Plausible Alternatives)
+ In all cases, this model was fine-tuned for the specific downstream task.
+ ## NER
+ (entry to be added soon)
+ ## Sentiment regression
+
+ The [ParlaSent dataset](https://huggingface.co/datasets/classla/ParlaSent) was used to evaluate sentiment regression for Bosnian, Croatian, and Serbian.
+ The procedure is explained in greater detail in the dedicated [benchmarking repository](https://github.com/clarinsi/benchich/tree/main/sentiment).
+
+ | system | train | test | r^2 |
+ |:-----------------------------------------------------------------------|:--------------------|:-------------------------|------:|
+ | [xlm-r-parlasent](https://huggingface.co/classla/xlm-r-parlasent) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.615 |
+ | [BERTić](https://huggingface.co/classla/bcms-bertic) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.612 |
+ | **XLM-R-SloBERTić** | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.607 |
+ | XLM-Roberta-Large | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.605 |
+ | XLM-R-BERTić | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.601 |
+ | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.537 |
+ | XLM-Roberta-Base | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.500 |
+ | dummy (mean) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | -0.12 |
+ ## COPA
+ (to be added soon)
+
+ # Citation
+ (to be added soon)
+ # Authors
+ * [Nikola Ljubešić](https://huggingface.co/nljubesi)
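
Note on the r^2 column in the sentiment regression table: the negative score in the `dummy (mean)` row is expected behaviour rather than an error. The sketch below shows how a constant predictor can score below zero, assuming r^2 is the usual coefficient of determination and that the dummy predicts the mean of the training labels (the README does not spell out either detail). The label values are hypothetical and are not taken from ParlaSent.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical sentiment labels for illustration only (not ParlaSent data).
train_labels = [1.0, 2.0, 3.0, 4.0, 5.0]
test_labels = [2.0, 3.0, 4.0, 5.0]

# Assumed dummy baseline: always predict the mean of the training labels.
train_mean = sum(train_labels) / len(train_labels)
dummy_predictions = [train_mean] * len(test_labels)

# The training mean (3.0) differs from the test mean (3.5), so the constant
# prediction does worse than the test mean itself and r^2 falls below zero.
dummy_r2 = r2_score(test_labels, dummy_predictions)
print(f"dummy r^2: {dummy_r2:.2f}")
```

A constant prediction scores exactly 0 only when it equals the mean of the test labels. Any gap between the training and test means pushes the score negative, which is why a dummy baseline can land slightly below zero.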