dchaplinsky committed
Commit 5b794a2 · 1 Parent(s): e3ae526

Update README.md

Files changed (1)
  1. README.md +59 -0
README.md CHANGED
@@ -1,3 +1,62 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
+ tags:
3
+ - flair
4
+ - token-classification
5
+ - sequence-tagger-model
6
+ language: uk
7
+ model-index:
8
+ - name: flair-uk-pos
9
+ results:
10
+ - task:
11
+ name: POS
12
+ type: token-classification
13
+ metrics:
14
+ - name: POS F Score
15
+ type: f_score
16
+ value: 0.9793
17
+ widget:
18
+ - text: "Президент Володимир Зеленський пояснив, що наразі діалог із режимом Володимира путіна неможливий, адже агресор обрав курс на знищення українського народу. За словами Зеленського цей режим РФ виявляє неповагу до суверенітету і територіальної цілісності України."
19
  license: mit
20
  ---
21
+
22
+ # flair-uk-pos
23
+
24
+ ## Model description
25
+
26
+ **flair-uk-pos** is a ready-to-use Flair model for universal part-of-speech (UPOS) tagging of Ukrainian. It is built on Flair embeddings that I trained for the Ukrainian language (available in [backward](https://huggingface.co/dchaplinsky/flair-uk-backward) and [forward](https://huggingface.co/dchaplinsky/flair-uk-forward) variants), and it combines high accuracy (0.9793 micro F-score) with a very **small size** (just 72 MB).
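
Reviewer note: the description says the model is ready to use, but the README does not show how to load it. Below is a minimal sketch, assuming a recent Flair release (0.11 or later) and assuming the model is published under the repository id `dchaplinsky/flair-uk-pos`. That id is inferred from the model name and the embedding links and is not stated in this commit. The helper names (`load_tagger`, `tag_text`, `format_pairs`, `main`) are illustrative only.

```python
from typing import List, Tuple

# Assumption: repository id inferred from the model name in this README;
# it is not confirmed anywhere in the commit itself.
MODEL_ID = "dchaplinsky/flair-uk-pos"


def load_tagger(model_id: str = MODEL_ID):
    """Download (on first use) and load the tagger from the Hugging Face Hub."""
    # Imported lazily so the rest of the module works without Flair installed.
    from flair.models import SequenceTagger

    return SequenceTagger.load(model_id)


def tag_text(tagger, text: str) -> List[Tuple[str, str]]:
    """Return (token, tag) pairs predicted by the tagger for one sentence."""
    from flair.data import Sentence

    sentence = Sentence(text)
    tagger.predict(sentence)
    # The label name is read from the model instead of being hard-coded.
    label_type = tagger.label_type
    return [(token.text, token.get_label(label_type).value) for token in sentence]


def format_pairs(pairs: List[Tuple[str, str]]) -> str:
    """Render (token, tag) pairs as 'token/TAG token/TAG ...'."""
    return " ".join(f"{word}/{tag}" for word, tag in pairs)


def main() -> None:
    tagger = load_tagger()
    # Second sentence of the widget example from the README front matter.
    text = (
        "За словами Зеленського цей режим РФ виявляє неповагу до "
        "суверенітету і територіальної цілісності України."
    )
    print(format_pairs(tag_text(tagger, text)))
```

Calling `main()` downloads the model on first use and prints each token with its predicted tag. Reading the label name from `tagger.label_type` avoids guessing whether the model stores its tags under `pos` or `upos`.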
27
+
28
+
29
+ Results:
30
+ - F-score (micro) **0.9793**
31
+ - F-score (macro) **0.9275**
32
+ - Accuracy **0.9793**
33
+
34
+ | By class: | precision | recall | f1-score | support |
35
+ |--------------|-----------|--------|----------|---------|
36
+ | NOUN | 0.9857 | 0.9851 | 0.9854 | 4549 |
37
+ | PUNCT | 0.9984 | 1.0000 | 0.9992 | 3097 |
38
+ | ADJ | 0.9772 | 0.9852 | 0.9812 | 1959 |
39
+ | ADP | 0.9956 | 0.9968 | 0.9962 | 1584 |
40
+ | VERB | 0.9891 | 0.9910 | 0.9900 | 1552 |
41
+ | ADV | 0.9630 | 0.9118 | 0.9367 | 714 |
42
+ | CCONJ | 0.9685 | 0.9746 | 0.9715 | 630 |
43
+ | PROPN | 0.9279 | 0.9472 | 0.9375 | 625 |
44
+ | DET | 0.9729 | 0.9698 | 0.9713 | 629 |
45
+ | PRON | 0.9706 | 0.9631 | 0.9669 | 515 |
46
+ | PART | 0.9235 | 0.8693 | 0.8956 | 375 |
47
+ | NUM | 0.9722 | 0.9804 | 0.9763 | 357 |
48
+ | SCONJ | 0.8768 | 0.9577 | 0.9154 | 260 |
49
+ | AUX | 0.8906 | 0.9500 | 0.9194 | 120 |
50
+ | X | 0.9833 | 0.9593 | 0.9712 | 123 |
51
+ | SYM | 1.0000 | 0.7059 | 0.8276 | 17 |
52
+ | INTJ | 0.5556 | 0.5000 | 0.5263 | 10 |
53
+ | accuracy | | | 0.9793 | 17116 |
54
+ | macro avg | 0.9383 | 0.9204 | 0.9275 | 17116 |
55
+ | weighted avg | 0.9794 | 0.9793 | 0.9792 | 17116 |
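
Reviewer note: the aggregate rows can be cross-checked against the per-class rows. The sketch below copies the f1-score and support columns from the table above and recomputes the total support, the macro average (unweighted mean of per-class F1) and the weighted average (mean weighted by support). The variable names are illustrative and not taken from the training code.

```python
# (f1-score, support) per class, copied from the table above.
PER_CLASS = {
    "NOUN": (0.9854, 4549),
    "PUNCT": (0.9992, 3097),
    "ADJ": (0.9812, 1959),
    "ADP": (0.9962, 1584),
    "VERB": (0.9900, 1552),
    "ADV": (0.9367, 714),
    "CCONJ": (0.9715, 630),
    "PROPN": (0.9375, 625),
    "DET": (0.9713, 629),
    "PRON": (0.9669, 515),
    "PART": (0.8956, 375),
    "NUM": (0.9763, 357),
    "SCONJ": (0.9154, 260),
    "AUX": (0.9194, 120),
    "X": (0.9712, 123),
    "SYM": (0.8276, 17),
    "INTJ": (0.5263, 10),
}

# Total number of evaluated tokens.
total_support = sum(support for _, support in PER_CLASS.values())

# Macro average: every class counts equally, regardless of frequency.
macro_f1 = sum(f1 for f1, _ in PER_CLASS.values()) / len(PER_CLASS)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in PER_CLASS.values()) / total_support

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
```

The recomputed macro average matches the reported 0.9275, and the supports sum to 17116. Because the per-class inputs are already rounded to four decimals, the recomputed weighted average can differ from the table in the last digit. The gap between the macro and micro scores comes mostly from rare classes such as INTJ and SYM, which have very few test tokens.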
56
+
57
+
58
+ The model was fine-tuned on the [Ukrainian Universal Dependencies (UD) corpus](https://universaldependencies.org/treebanks/uk_iu/index.html), released by the non-profit [Institute for Ukrainian](https://mova.institute).
59
+ The training code is available in the [lang-uk/flair-pos repository](https://github.com/lang-uk/flair-pos).
60
+
61
+
62
+ Copyright: [Dmytro Chaplynskyi](https://twitter.com/dchaplinsky), [lang-uk project](https://lang.org.ua), 2022