---
license: unknown
language:
- en
pipeline_tag: token-classification
tags:
- wine
- ner
---

# Wineberto labels

A pretrained named entity recognition model for wine labels, fine-tuned from bert-base-uncased.

## Model description

Wineberto is a bert-base-uncased model fine-tuned for token classification on wine label text. It tags the producer, wine, region, subregion, country, vintage, and classification mentioned on a label.

## How to use

You can use this model directly for named entity recognition:

```python
>>> from transformers import pipeline
>>> ner = pipeline('ner', model='winberto-labels', aggregation_strategy='simple')
>>> tokens = ner("Heitz Cabernet Sauvignon California Napa Valley Napa US")
>>> for t in tokens:
...     print(f"{t['word']}: {t['entity_group']}: {t['score']:.5}")

heitz: producer: 0.99758
cabernet: wine: 0.92263
sauvignon: wine: 0.92472
california: region: 0.53502
napa valley: subregion: 0.79638
us: country: 0.93675
```

## Training data

The model was trained on 50K wine labels derived from https://www.liv-ex.com/wwd/lwin/ and manually annotated with the following entity labels:

```
"1": "B-classification",
"2": "B-country",
"3": "B-producer",
"4": "B-region",
"5": "B-subregion",
"6": "B-vintage",
"7": "B-wine"
```
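
The mapping above starts at `"1"`, which suggests index 0 is the usual `O` (outside) tag. Under that assumption, the config's `id2label`/`label2id` dictionaries could be reconstructed like this (a sketch, not taken from the model files):

```python
# Reconstructed label mapping; index 0 as "O" is an assumption,
# following the standard BIO tagging convention.
entity_types = [
    "classification", "country", "producer",
    "region", "subregion", "vintage", "wine",
]
id2label = {0: "O"}
id2label.update({i + 1: f"B-{name}" for i, name in enumerate(entity_types)})
label2id = {label: idx for idx, label in id2label.items()}

print(id2label[3])         # B-producer
print(label2id["B-wine"])  # 7
```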

## Training procedure

```python
from transformers import TrainingArguments

model_id = 'bert-base-uncased'
arguments = TrainingArguments(
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    weight_decay=0.01,
)
...
trainer.train()
```
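
Because bert-base-uncased splits words into wordpieces, word-level annotations have to be expanded to subword tokens before training. A minimal sketch of that alignment step, using a hypothetical helper that mirrors the `word_ids()` output of Hugging Face fast tokenizers (not the project's actual preprocessing code):

```python
def align_labels(word_labels, word_ids):
    """Expand word-level NER labels to subword tokens.

    word_ids maps each subword token to its source word index
    (None for special tokens like [CLS]/[SEP]). Continuation
    subwords get -100 so the cross-entropy loss ignores them.
    """
    aligned, previous = [], None
    for wid in word_ids:
        if wid is None:
            aligned.append(-100)              # special token: ignored in loss
        elif wid != previous:
            aligned.append(word_labels[wid])  # first subword keeps the label
        else:
            aligned.append(-100)              # continuation subword: ignored
        previous = wid
    return aligned

# "heitz" -> ["hei", "##tz"], wrapped in [CLS]/[SEP] specials
print(align_labels([3, 7], [None, 0, 0, 1, None]))  # [-100, 3, -100, 7, -100]
```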