---
license: openrail++
language:
- uk
widget:
- text: "Ти неймовірна!"
---

## Binary toxicity classifier for Ukrainian

This is an instance of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) fine-tuned on the downstream task of binary toxicity classification for Ukrainian.

The evaluation metrics for binary toxicity classification are:

- **Precision**: 0.9468
- **Recall**: 0.9465
- **F1**: 0.9465

The training and evaluation data will be clarified later.

## How to use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# load tokenizer and model weights
tokenizer = AutoTokenizer.from_pretrained('dardem/xlm-roberta-large-uk-toxicity')
model = AutoModelForSequenceClassification.from_pretrained('dardem/xlm-roberta-large-uk-toxicity')

# prepare the input
batch = tokenizer.encode('Ти неймовірна!', return_tensors='pt')

# inference
model(batch)
```
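
The call above returns raw logits. A minimal sketch of turning them into class probabilities and a predicted label follows; the label order (0 = non-toxic, 1 = toxic) is an assumption and should be verified against `model.config.id2label`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('dardem/xlm-roberta-large-uk-toxicity')
model = AutoModelForSequenceClassification.from_pretrained('dardem/xlm-roberta-large-uk-toxicity')

batch = tokenizer.encode('Ти неймовірна!', return_tensors='pt')

# run inference without tracking gradients
with torch.no_grad():
    logits = model(batch).logits

# softmax over the two classes gives probabilities that sum to 1
probs = torch.softmax(logits, dim=-1).squeeze()

# assumed mapping: index 0 = non-toxic, 1 = toxic (check model.config.id2label)
predicted = int(torch.argmax(probs))
print(predicted, probs[predicted].item())
```

For batch inference over several texts, `tokenizer(texts, padding=True, return_tensors='pt')` can be passed to the model with `**` unpacking instead of the single-sentence `encode` call.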