# Model Card for BERT Slot Filling Model
This BERT-based model performs slot filling on natural language sentences, extracting specific pieces of information for applications such as chatbots and virtual assistants.
For example:

- **Input:** Transfer $500 from checking to student savings
- **Output:** transfer **[$500:B-amount]** from **[checking:B-account-from]** to **[student:B-account-to]** **[savings:I-account-to]**
The model was trained on the dataset available at https://github.com/SunLemuria/JointBERT-Tensorflow1/blob/master/data/.

For background, see the paper *BERT for Joint Intent Classification and Slot Filling*: https://arxiv.org/pdf/1902.10909.pdf
## Model Details
The model predicts the following BIO slot tags:

| Tag | Definition |
|---|---|
| B-account-from | Start of the source account in a transaction. |
| I-account-from | Continuation of the source account in a transaction. |
| B-account-to | Start of the target account in a transaction. |
| I-account-to | Continuation of the target account in a transaction. |
| B-bill_type | Start of the type of bill or service. |
| I-bill_type | Continuation of the type of bill or service. |
| B-transaction-from | Start of the origin of a transaction or fraud. |
| I-transaction-from | Continuation of the origin of a transaction or fraud. |
| B-transaction-to | Start of the destination or end of a transaction or fraud. |
| I-transaction-to | Continuation of the destination of a transaction or fraud. |
| B-amount | Start of a specified amount of money. |
| I-amount | Continuation of a specified amount of money. |
| B-timeRange | Start of a specific time range or date. |
| I-timeRange | Continuation of a specific time range or date. |
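To make the tag scheme concrete, here is roughly how the example sentence from above maps to word-level tags (a minimal illustration; the model itself operates on WordPiece sub-tokens, so the exact alignment can differ):

```python
# Word-level BIO tags for "Transfer $500 from checking to student savings".
# Illustration only: the actual model predicts tags per WordPiece sub-token.
example = [
    ("Transfer", "O"),
    ("$500",     "B-amount"),
    ("from",     "O"),
    ("checking", "B-account-from"),
    ("to",       "O"),
    ("student",  "B-account-to"),
    ("savings",  "I-account-to"),
]
for word, tag in example:
    print(f"{word}\t{tag}")
```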
### Model Description
- **Developed by:** Andy González
- **Model type:** Token Classification
- **Language(s) (NLP):** English
- **Finetuned from model:** bert-base-uncased
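The card does not document the training script. As a rough sketch only, a slot-filling head is typically attached to `bert-base-uncased` as a token-classification layer; the label count below is a placeholder inferred from the tag table, not a confirmed value:

```python
# Illustrative sketch only: how a token-classification (slot-filling) head is
# typically attached to bert-base-uncased. The actual training setup and
# hyperparameters for this model are not documented in this card.
from transformers import BertForTokenClassification, BertTokenizerFast

num_slot_labels = 15  # placeholder: 14 B-/I- tags from the table above plus "O"; the real count comes from slots.txt
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=num_slot_labels,
)
```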
## How to Get Started with the Model
```python
# Install dependencies first: pip install torch transformers requests
import os
import requests
import torch
from transformers import BertForTokenClassification, BertTokenizerFast

# URL and local file for the slot label list
slots_url = 'https://huggingface.co/andgonzalez/bert-uncased-slot-filling/raw/main/slots.txt'
slots_file = 'slots.txt'
device = "cpu"

# Download and save the slot labels if they are not present locally
if not os.path.exists(slots_file):
    response = requests.get(slots_url)
    response.raise_for_status()
    with open(slots_file, 'w') as file:
        file.write(response.text)

# Read the slot labels
with open(slots_file, 'r') as file:
    slot_labels = file.read().splitlines()

# Load the tokenizer and the model
tokenizer = BertTokenizerFast.from_pretrained('andgonzalez/bert-uncased-slot-filling')
model = BertForTokenClassification.from_pretrained('andgonzalez/bert-uncased-slot-filling')
model.to(device)
model.eval()

# Example sentence
sentence = "Transfer $500 from checking to student savings"
inputs = tokenizer(sentence, truncation=True, padding='max_length', max_length=20, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)

# Turn the logits into per-token label predictions
logits = outputs.logits
predictions = torch.argmax(logits, dim=2).squeeze().cpu().numpy()
words = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze().cpu().numpy())

# Format the sentence, merging "$" with the following number and skipping sub-word pieces
skip_next = False
formatted_sentence = []
for i, (word, pred) in enumerate(zip(words, predictions)):
    if word not in ['[PAD]', '[SEP]', '[CLS]']:
        label = slot_labels[pred]
        if word == "$" and i + 1 < len(words) and words[i + 1].replace("##", "").isdigit():
            # Re-attach the amount to the "$" symbol, e.g. "$" + "500" -> "$500"
            next_word = words[i + 1].replace("##", "")
            combined_word = word + next_word
            formatted_sentence.append(f'[{combined_word}:{label}]')
            skip_next = True
        elif skip_next:
            # Token already consumed by the "$"-merging step above
            skip_next = False
            continue
        elif not word.startswith("##"):
            if label != 'O':
                formatted_sentence.append(f'[{word}:{label}]')
            else:
                formatted_sentence.append(word)

formatted_sentence = ' '.join(formatted_sentence)
print(formatted_sentence)
```
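If you prefer not to merge sub-tokens by hand, the fast tokenizer's `word_ids()` mapping offers an alternative way to align predictions back to whole words. The sketch below assumes the same checkpoint and `slots.txt` file as the snippet above; it is an illustration, not part of the original example:

```python
import torch
from transformers import BertForTokenClassification, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained('andgonzalez/bert-uncased-slot-filling')
model = BertForTokenClassification.from_pretrained('andgonzalez/bert-uncased-slot-filling')
model.eval()

# slot_labels is assumed to come from the slots.txt downloaded in the snippet above
with open('slots.txt') as f:
    slot_labels = f.read().splitlines()

sentence = "Transfer $500 from checking to student savings"
words = sentence.split()

# Tokenize pre-split words so word_ids() maps each sub-token back to a word index
encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**encoding).logits
preds = logits.argmax(dim=-1).squeeze(0).tolist()

# Keep the prediction of the first sub-token of each word
word_tags = {}
for idx, word_id in enumerate(encoding.word_ids(batch_index=0)):
    if word_id is not None and word_id not in word_tags:
        word_tags[word_id] = slot_labels[preds[idx]]

print([(w, word_tags[i]) for i, w in enumerate(words)])
```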
## Training Details
### Metrics
- Metrics used: Precision, Recall, F1-Score (a sketch of how such metrics are commonly computed follows below)
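The card does not state how these metrics were computed; for slot filling, entity-level precision, recall, and F1 are commonly computed with the `seqeval` library. The snippet below is a hypothetical illustration using the tag set from this card, not the actual evaluation code:

```python
# Hypothetical illustration: entity-level slot-filling metrics via seqeval
# (pip install seqeval); the library actually used for the numbers below is not documented.
from seqeval.metrics import precision_score, recall_score, f1_score

y_true = [["O", "B-amount", "O", "B-account-from", "O", "B-account-to", "I-account-to"]]
y_pred = [["O", "B-amount", "O", "B-account-from", "O", "B-account-to", "I-account-to"]]

print(precision_score(y_true, y_pred))  # 1.0 for this toy example
print(recall_score(y_true, y_pred))     # 1.0
print(f1_score(y_true, y_pred))         # 1.0
```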
### Results
- **Epoch 1:** Loss: 1.3253, Precision: 0.5862, Recall: 0.5758, F1-Score: 0.5633
- **Epoch 2:** Loss: 0.3507, Precision: 0.7491, Recall: 0.7476, F1-Score: 0.7374
- **Epoch 3:** Loss: 0.2156, Precision: 0.8180, Recall: 0.8138, F1-Score: 0.8007
- **Epoch 4:** Loss: 0.1593, Precision: 0.8252, Recall: 0.8274, F1-Score: 0.8173
- **Epoch 5:** Loss: 0.1236, Precision: 0.8613, Recall: 0.8549, F1-Score: 0.8466
- **Epoch 6:** Loss: 0.0961, Precision: 0.8839, Recall: 0.8810, F1-Score: 0.8786
- **Epoch 7:** Loss: 0.0787, Precision: 0.8795, Recall: 0.8917, F1-Score: 0.8808
- **Epoch 8:** Loss: 0.0644, Precision: 0.8956, Recall: 0.8958, F1-Score: 0.8911
- **Epoch 9:** Loss: 0.0542, Precision: 0.8889, Recall: 0.9012, F1-Score: 0.8913
- **Epoch 10:** Loss: 0.0468, Precision: 0.8980, Recall: 0.9007, F1-Score: 0.8935
- **Best Model:** Epoch 8, Test Loss: 0.1588
### Plots
