Update README.md
README.md
CHANGED
@@ -1,3 +1,26 @@
----
-license: mit
-
+---
+license: mit
+language:
+- xh
+- zu
+- nr
+- ss
+---
+
+Usage:
+
+1. For mask prediction
+
+```
+tokenizer = AutoTokenizer.from_pretrained("francois-meyer/nguni-xlmr-large")
+model = XLMRobertaForMaskedLM.from_pretrained("francois-meyer/nguni-xlmr-large")
+text = "A test <mask> for the nguni model." ## Replace with any sentence from the Nguni Languages with mask tokens.
+inputs = tokenizer(text, return_tensors="pt")
+with torch.no_grad():
+    logits = model(**inputs).logits
+mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
+predicted_token_id = logits[0, mask_token_index].argmax(axis=-1)
+print(tokenizer.decode(predicted_token_id))
+```
+
+2. For any other task, you might want to fine-tune the model in the same way you fine-tune a BERT/XLMR model.
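
The mask-prediction snippet added in this change uses `torch`, `AutoTokenizer`, and `XLMRobertaForMaskedLM` without importing them, so running it as written presumably also needs `import torch` and `from transformers import AutoTokenizer, XLMRobertaForMaskedLM`. Its last three lines find every masked position and take the highest-scoring vocabulary id at each one. That index logic can be traced without downloading the model. The following is a minimal sketch in plain Python. The token ids, the mask id of 4, the five-entry vocabulary, and the helper names are all invented for illustration and are not values or functions from the real tokenizer or library.

```python
# Toy stand-ins for inputs.input_ids[0] and logits[0]; every number is made up.
MASK_TOKEN_ID = 4  # hypothetical; the real value comes from tokenizer.mask_token_id


def mask_positions(input_ids, mask_token_id):
    """Return the sequence positions whose token id equals the mask id."""
    return [i for i, tok in enumerate(input_ids) if tok == mask_token_id]


def argmax(scores):
    """Return the index of the largest score."""
    return max(range(len(scores)), key=lambda j: scores[j])


def predict_masked_ids(input_ids, logits, mask_token_id):
    """For each masked position, pick the highest-scoring vocabulary id."""
    return [argmax(logits[i]) for i in mask_positions(input_ids, mask_token_id)]


# One sequence of six tokens, two of which are masks (positions 2 and 4).
input_ids = [0, 3, 4, 1, 4, 2]

# One row of scores per position, one column per vocabulary id (0..4).
logits = [
    [0.1, 0.2, 0.3, 0.4, 0.0],
    [0.0, 0.0, 0.0, 9.0, 0.0],
    [0.5, 2.5, 0.1, 0.3, 0.2],  # masked position 2
    [0.0, 7.0, 0.0, 0.0, 0.0],
    [0.2, 0.1, 0.4, 3.0, 0.6],  # masked position 4
    [0.0, 0.0, 8.0, 0.0, 0.0],
]

predicted = predict_masked_ids(input_ids, logits, MASK_TOKEN_ID)
print(predicted)  # [1, 3]
```

Because the lookup returns one id per masked position, the same pattern covers sentences with several `<mask>` tokens. In the README snippet, `tokenizer.decode(predicted_token_id)` then turns those ids back into text.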