panigrah
/

wineberto-labels


panigrah committed on Nov 10, 2023

Commit

251df0e

·

1 Parent(s): ff949f2

Create README.md

Files changed (1)

README.md +70 -0

README.md ADDED

	@@ -0,0 +1,70 @@

+---
+license: unknown
+language:
+- en
+pipeline_tag: token-classification
+tags:
+- wine
+- ner
+---
+# Wineberto labels
+Pretrained model for named entity recognition, trained on wine labels only, that uses bert-base-uncased as the base model.
+## Model description
+## How to use
+You can use this model directly for named entity recognition like so:
+```python
+>>> from transformers import pipeline
+>>> ner = pipeline('ner', model='panigrah/wineberto-labels', aggregation_strategy='simple')
+>>> tokens = ner("Heitz Cabernet Sauvignon California Napa Valley Napa US")
+>>> for t in tokens:
+...     print(f"{t['word']}: {t['entity_group']}: {t['score']:.5}")
+heitz: producer: 0.99758
+cabernet: wine: 0.92263
+sauvignon: wine: 0.92472
+california: region: 0.53502
+napa valley: subregion: 0.79638
+us: country: 0.93675
+```
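The per-entity results can be folded into a single record per wine label. The following is a minimal sketch, not part of the original model card: the helper `group_by_entity` and its `min_score` parameter are hypothetical, and the sample input simply mirrors the output printed above so that no model download is needed.

```python
from collections import defaultdict


def group_by_entity(entities, min_score=0.0):
    """Collect pipeline results into {entity_group: "joined words"}."""
    grouped = defaultdict(list)
    for ent in entities:
        # Skip low-confidence predictions (threshold is a caller's choice).
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return {label: " ".join(words) for label, words in grouped.items()}


# Sample input copied from the printed output above.
sample = [
    {"word": "heitz", "entity_group": "producer", "score": 0.99758},
    {"word": "cabernet", "entity_group": "wine", "score": 0.92263},
    {"word": "sauvignon", "entity_group": "wine", "score": 0.92472},
    {"word": "california", "entity_group": "region", "score": 0.53502},
    {"word": "napa valley", "entity_group": "subregion", "score": 0.79638},
    {"word": "us", "entity_group": "country", "score": 0.93675},
]

record = group_by_entity(sample)
print(record)
```

Passing a threshold such as `min_score=0.6` would drop the low-confidence `region` prediction from the record.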
+## Training data
+The BERT model was trained on 50K wine labels derived from https://www.liv-ex.com/wwd/lwin/ and manually annotated with the following entity tags:
+```
+"1": "B-classification",
+"2": "B-country",
+"3": "B-producer",
+"4": "B-region",
+"5": "B-subregion",
+"6": "B-vintage",
+"7": "B-wine"
+```
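A token-classification config usually carries this mapping in both directions. The sketch below builds `id2label` and `label2id` from the ids listed above. Note the assumption: id 0 is not shown in the source, and treating it as `"O"` (outside any entity) follows a common convention but is a guess here. The helper `entity_type` is hypothetical.

```python
# Labels exactly as listed above (ids 1 to 7).
listed = {
    1: "B-classification",
    2: "B-country",
    3: "B-producer",
    4: "B-region",
    5: "B-subregion",
    6: "B-vintage",
    7: "B-wine",
}

# ASSUMPTION: id 0 is not given in the source; "O" is a conventional guess.
id2label = {0: "O", **listed}
label2id = {label: idx for idx, label in id2label.items()}


def entity_type(label):
    """Strip the B- prefix, e.g. 'B-wine' -> 'wine'; leave 'O' unchanged."""
    return label.split("-", 1)[1] if "-" in label else label


print(label2id["B-wine"], entity_type(id2label[3]))
```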
+## Training procedure
+```python
+from transformers import TrainingArguments
+model_id = 'bert-base-uncased'
+arguments = TrainingArguments(
+    evaluation_strategy="epoch",
+    learning_rate=2e-5,
+    per_device_train_batch_size=8,
+    per_device_eval_batch_size=8,
+    num_train_epochs=5,
+    weight_decay=0.01,
+)
+...
+trainer.train()
+```
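The elided part of the training script is not shown in the source. One step that BERT token-classification training commonly needs is aligning word-level labels with subword tokens, since the tokenizer may split a word into several pieces. The sketch below illustrates that general technique under stated assumptions and is not the author's code: `align_labels` is a hypothetical helper, and `word_ids` has the shape returned by a fast tokenizer's `BatchEncoding.word_ids()`.

```python
IGNORE_INDEX = -100  # label id that PyTorch's cross-entropy loss skips by default


def align_labels(word_ids, word_labels):
    """Map word-level label ids onto subword tokens.

    word_ids: per-token word index, None for special tokens.
    word_labels: one label id per word.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            aligned.append(IGNORE_INDEX)      # special tokens such as [CLS], [SEP]
        elif wid != previous:
            aligned.append(word_labels[wid])  # first subword carries the label
        else:
            aligned.append(IGNORE_INDEX)      # later subwords are ignored in the loss
        previous = wid
    return aligned


# Hypothetical example: three words, the second split into two subword pieces.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [3, 7, 7]  # B-producer, B-wine, B-wine (ids from the list above)
aligned = align_labels(word_ids, word_labels)
print(aligned)
```

Labelling only the first subword is one common choice; another is to repeat the word's label on every piece.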