lewtun
/

quantized-distilbert-banking77

Text Classification

Model card Files Files and versions Community

Create README.md

#1

by lewtun HF staff - opened Jun 8, 2022

base: refs/heads/main

←

from: refs/pr/1

Discussion Files changed

Files changed (1) hide show

README.md +60 -0

README.md ADDED Viewed

	@@ -0,0 +1,60 @@

+---
+tags:
+- optimum
+datasets:
+- banking77
+metrics:
+- accuracy
+model-index:
+- name: quantized-distilbert-banking77
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: banking77
+      type: banking77
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.9244
+---
+# Quantized-distilbert-banking77
+This model is a dynamically quantized version of [optimum/distilbert-base-uncased-finetuned-banking77](https://huggingface.co/optimum/distilbert-base-uncased-finetuned-banking77) on the `banking77` dataset.
+The model was created using the [dynamic-quantization](https://github.com/huggingface/workshops/tree/main/mlops-world) notebook from a workshop presented at MLOps World 2022.
+It achieves the following results on the evaluation set:
+**Accuracy**
+- Vanilla model: 92.5%
+- Quantized model: 92.44%
+> The quantized model achieves 99.72% accuracy of the fp32 model
+**Latency**
+Payload sequence length: 128
+Instance type: AWS c6i.xlarge
+| latency | vanilla transformers | quantized optimum model | improvement |
+|---------|----------------------|-------------------------|-------------|
+| p95     | 63.24ms              | 37.06ms                 | 1.71x       |
+| avg     | 62.87ms              | 37.93ms                 | 1.66x       |
+## How to use
+```python
+from optimum.onnxruntime import ORTModelForSequenceClassification
+from transformers import pipeline, AutoTokenizer
+model = ORTModelForSequenceClassification.from_pretrained("lewtun/quantized-distilbert-banking77")
+tokenizer = AutoTokenizer.from_pretrained("lewtun/quantized-distilbert-banking77")
+classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
+classifier("What is the exchange rate like on this app?")
+```