T5 Product Category & Subcategory Classifier

This model is a fine-tuned version of t5-base for classifying products into a category and a subcategory.

Model Description

  • Model Type: T5 (Text-to-Text Transfer Transformer)
  • Language: English
  • Task: Product Classification
  • Training Data: 10,172 categorized products
  • Input Format: "Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {product_name}"
  • Output Format: "Category: {category} | Subcategory: {subcategory}" (see the parsing sketch below)
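
The model returns this as a single string, so downstream code usually splits it back into the two fields. Below is a minimal parsing sketch; the helper name parse_prediction and the example labels are illustrative, not part of the model.

def parse_prediction(prediction: str) -> dict:
    # Expected shape: "Category: {category} | Subcategory: {subcategory}"
    category_part, _, subcategory_part = prediction.partition("|")
    return {
        "category": category_part.replace("Category:", "").strip(),
        "subcategory": subcategory_part.replace("Subcategory:", "").strip(),
    }

# Illustrative labels only; the real label set comes from the training data
parse_prediction("Category: Personal Care | Subcategory: Shampoo")
# {'category': 'Personal Care', 'subcategory': 'Shampoo'}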

Usage

from transformers import T5ForConditionalGeneration, T5Tokenizer

# Replace {repo_id} with this model's Hub repository id
model = T5ForConditionalGeneration.from_pretrained("{repo_id}")
tokenizer = T5Tokenizer.from_pretrained("{repo_id}")

def predict(text):
    # Wrap the product name in the same prompt the model was fine-tuned with
    prompt = f"Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {text}"
    inputs = tokenizer(prompt, return_tensors="pt", max_length=128, truncation=True)

    # Beam search decodes the "Category: ... | Subcategory: ..." string
    outputs = model.generate(**inputs, max_length=32, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
result = predict("Pantene Suave & Liso Shampoo")
print(result)
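
To classify many products at once, the same prompt can be tokenized with padding and passed to generate in batches. This is a minimal sketch, assuming the model and tokenizer loaded above and a list of product names that fits in memory:

import torch

def predict_batch(texts, batch_size=16):
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        prompts = [
            "Predict the product category and subcategory in the following format: "
            f"'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {name}"
            for name in batch
        ]
        # Pad to the longest prompt in the batch so generate() gets a rectangular tensor
        inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                           max_length=128, truncation=True)
        with torch.no_grad():
            outputs = model.generate(**inputs, max_length=32, num_beams=4)
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results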

Training Details

  • Base Model: t5-base
  • Training Type: Fine-tuning (a comparable setup is sketched after this list)
  • Epochs: 5
  • Batch Size: 8
  • Learning Rate: 3e-5
  • Weight Decay: 0.01
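
The original training script is not included in this card; the sketch below shows how a comparable fine-tuning run could be set up with the transformers Seq2SeqTrainer using the hyperparameters above. The column names (product_name, category, subcategory), the output directory, and the train_dataset / eval_dataset variables are assumptions, not details taken from the actual run.

from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

model = T5ForConditionalGeneration.from_pretrained("t5-base")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

def preprocess(example):
    # Build the prompt/target pair in this card's input/output format
    prompt = (
        "Predict the product category and subcategory in the following format: "
        "'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. "
        f"Product: {example['product_name']}"
    )
    target = f"Category: {example['category']} | Subcategory: {example['subcategory']}"
    model_inputs = tokenizer(prompt, max_length=128, truncation=True)
    model_inputs["labels"] = tokenizer(text_target=target, max_length=32, truncation=True)["input_ids"]
    return model_inputs

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-product-classifier",  # assumed output path
    num_train_epochs=5,
    per_device_train_batch_size=8,
    learning_rate=3e-5,
    weight_decay=0.01,
)

# train_dataset / eval_dataset: dataset splits with product_name, category and
# subcategory columns, mapped through preprocess() beforehand (not shown here)
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()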

Limitations

  • The model works best with product names in English
  • Performance may vary for products outside the training categories
  • Requires clear and specific product descriptions