---
pipeline_tag: token-classification
tags:
- named-entity-recognition
- sequence-tagger-model
widget:
- text: Numele meu este Amadeus Wolfgang și locuiesc în Berlin
inference:
parameters:
aggregation_strategy: simple
grouped_entities: true
language:
- ro
---
XLM-RoBERTa model fine-tuned on the [RoNEC](https://github.com/dumitrescustefan/ronec) Romanian named-entity corpus, reaching 95.48 macro F1 on the test set.

| Test metric | Results |
|------------------------|--------------------------|
| test_f1_mac_ronec | 0.9547659158706665 |
| test_loss_ronec | 0.16371206939220428 |
| test_prec_mac_ronec | 0.8663718700408936 |
| test_rec_mac_ronec | 0.8695588111877441 |
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("EvanD/xlm-roberta-base-romanian-ner-ronec")
ner_model = AutoModelForTokenClassification.from_pretrained("EvanD/xlm-roberta-base-romanian-ner-ronec")

# aggregation_strategy="simple" merges word-piece tokens into whole entity spans
nlp = pipeline("ner", model=ner_model, tokenizer=tokenizer, aggregation_strategy="simple")

example = "Numele meu este Amadeus Wolfgang și locuiesc în Berlin"
ner_results = nlp(example)
print(ner_results)
# [
# {
# 'entity_group': 'PER',
# 'score': 0.9966806,
# 'word': 'Amadeus Wolfgang',
# 'start': 16,
# 'end': 32
# },
# {'entity_group': 'GPE',
# 'score': 0.99694663,
# 'word': 'Berlin',
# 'start': 48,
# 'end': 54
# }
# ]
```
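The `start` and `end` values returned by the pipeline are character offsets into the input string, so each entity can be recovered by plain slicing. A minimal check (pure Python, no model download needed), using the offsets from the example output above:

```python
example = "Numele meu este Amadeus Wolfgang și locuiesc în Berlin"

# Slicing the input with the pipeline's character offsets yields the entity text
per_span = example[16:32]  # 'Amadeus Wolfgang' (entity_group PER)
gpe_span = example[48:54]  # 'Berlin' (entity_group GPE)
print(per_span, "|", gpe_span)
```

This is handy for highlighting entities in the original text without re-tokenizing.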