tags:
- fine-tuning
- chatbot
- llm
license: cdla-sharing-1.0
---

# fintech-chatbot-t5

## Model Description

This model is based on the T5-small architecture and was fine-tuned on a [retail banking chatbot dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset/tree/main). It answers common banking-related queries about account balances, transaction details, card activation, and more.

The model generates free-text responses to customer queries and is suited for use in automated customer-service systems and virtual assistants.

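To see what kind of queries the model was trained on, the dataset can be loaded directly from the Hub. This is a minimal sketch that prints the column names rather than assuming them; it assumes the dataset exposes a `train` split:

```python
from datasets import load_dataset

# Load the Bitext retail-banking chatbot dataset used for fine-tuning.
ds = load_dataset(
    "bitext/Bitext-retail-banking-llm-chatbot-training-dataset",
    split="train",
)

print(ds.column_names)  # inspect the available fields
print(ds[0])            # look at one example record
```
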
## Model Details

- **Model Type:** T5-small
- **Training Dataset:** [retail banking chatbot dataset](https://huggingface.co/datasets/bitext/Bitext-retail-banking-llm-chatbot-training-dataset/tree/main)
- **Tasks:** Natural Language Generation (NLG)
- **Languages Supported:** English

## Training Details

- **Number of Epochs:** 3
- **Training Loss:** 0.79
- **Evaluation Loss:** 0.46
- **Evaluation Metric:** Mean Squared Error
- **Batch Size:** 8

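The training script itself is not included in this repository. The sketch below shows how an equivalent fine-tune could be set up with the Transformers `Seq2SeqTrainer`, matching the epoch count and batch size listed above. The `instruction` and `response` column names are assumptions; check them against `ds.column_names` before running:

```python
from datasets import load_dataset
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

ds = load_dataset(
    "bitext/Bitext-retail-banking-llm-chatbot-training-dataset",
    split="train",
)

def preprocess(batch):
    # "instruction" and "response" are assumed column names.
    inputs = tokenizer(batch["instruction"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["response"], truncation=True, max_length=128)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = ds.map(preprocess, batched=True, remove_columns=ds.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="fintech-chatbot-t5",
    num_train_epochs=3,              # matches the reported epoch count
    per_device_train_batch_size=8,   # matches the reported batch size
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```
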
## How to Use the Model

You can load and use this model with the following code:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the fine-tuned tokenizer and model from the Hub.
tokenizer = T5Tokenizer.from_pretrained("cuneytkaya/fintech-chatbot-t5")
model = T5ForConditionalGeneration.from_pretrained("cuneytkaya/fintech-chatbot-t5")

# Encode a banking question and generate a response.
input_text = "How can I activate my credit card?"
inputs = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)

# Decode, skipping special tokens such as <pad> and </s>.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
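
Alternatively, the `pipeline` API wraps tokenization, generation, and decoding in a single call; a brief sketch (the generation settings here are illustrative):

```python
from transformers import pipeline

chatbot = pipeline("text2text-generation", model="cuneytkaya/fintech-chatbot-t5")

result = chatbot("How can I activate my credit card?", max_new_tokens=64)
print(result[0]["generated_text"])
```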