---
language:
- de
tags:
- flair
- sequence-tagger-model
- part-of-speech
- tweets
---

# Fine-grained POS Tagging of German Tweets

This Flair model was trained on the German Tweets dataset presented in the paper
[Fine-grained POS Tagging of German Tweets](https://pdfs.semanticscholar.org/82c9/90aa15e2e35de8294b4a721785da1ede20d0.pdf)
by Ines Rehbein.

It achieves an accuracy of 92.88% on the development set and **93.16%** on the final test set.
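
For readers unfamiliar with the metric, the following is a minimal sketch of how tagging accuracy can be computed, assuming the figures above are token-level accuracy (the fraction of tokens whose predicted tag matches the gold tag). The function name `token_accuracy` and the toy tag sequences are hypothetical and are not taken from the training code or the dataset.

```python
def token_accuracy(gold_tags, predicted_tags):
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


# Toy data, invented purely for illustration (not taken from the dataset)
gold = ["ADDRESS", "PPER", "PTKNEG", "EMO"]
pred = ["ADDRESS", "PPER", "ADV", "EMO"]

print(f"{token_accuracy(gold, pred):.2%}")  # prints 75.00%
```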

## Training

All training code is available in the [flair-experiments](https://github.com/stefan-it/flair-experiments/tree/master/pos-twitter-german) repository.

The model follows the architecture and training strategy proposed in the original [Flair](https://aclanthology.org/C18-1139/) paper:
German FastText embeddings and German Flair embeddings are stacked and passed into a BiLSTM-CRF sequence labeler, achieving robust
state-of-the-art results on PoS tagging of German tweets.
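
The stacking described above could be sketched in Flair roughly as follows. This is an illustration under stated assumptions, not the released training script: the helper name `build_tagger`, the `hidden_size` of 256, and the `"pos"` label name are assumptions, and the embedding identifiers are Flair's standard German models, which may differ from the ones actually used. The linked repository remains the authoritative source.

```python
def build_tagger(tag_dictionary, hidden_size=256):
    """Assemble a BiLSTM-CRF tagger over stacked German embeddings (sketch)."""
    # Imported lazily so that defining this function needs neither Flair nor downloads.
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings, WordEmbeddings
    from flair.models import SequenceTagger

    embeddings = StackedEmbeddings([
        WordEmbeddings("de"),            # German FastText word embeddings
        FlairEmbeddings("de-forward"),   # German Flair embeddings, forward LM
        FlairEmbeddings("de-backward"),  # German Flair embeddings, backward LM
    ])

    return SequenceTagger(
        hidden_size=hidden_size,        # assumed value, not taken from the training code
        embeddings=embeddings,
        tag_dictionary=tag_dictionary,  # label dictionary built from the training corpus
        tag_type="pos",                 # assumed label name
        use_crf=True,                   # CRF decoding layer on top of the BiLSTM
    )
```

Calling `build_tagger` downloads the embedding models on first use, and the `tag_dictionary` argument has to be built from the training corpus beforehand.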

The full training log can be found [here](training.log).

## Demo: How to use in Flair

```python
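from flair.data import Sentence
from flair.models import SequenceTagger

# Load the tagger from the Hugging Face model hub
model = SequenceTagger.load('flair/de-pos-fine-grained')

# The raw string keeps the backslash in "\o/" literal;
# use_tokenizer=False splits the text on whitespace only
sent = Sentence(r"@Sneeekas Ich nicht \o/", use_tokenizer=False)
model.predict(sent)

print(sent)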
```

This yields the following output:

```text
Sentence[4]: "@Sneeekas Ich nicht \o/" → ["@Sneeekas"/ADDRESS, "Ich"/PPER, "nicht"/PTKNEG, "\o/"/EMO]
```