DistilBERT Command Classifier

A fine-tuned DistilBERT model for classifying user commands and questions with high accuracy, including handling of typos and variations.

Model Details

Model Description

This model is a fine-tuned version of distilbert-base-uncased specifically trained to classify various command types from user input. It's designed to handle natural language commands with typos, variations in phrasing, and different command intents.

  • Developed by: jhonacmarvik
  • Model type: Text Classification (Sequence Classification)
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: distilbert-base-uncased

Model Sources

  • Repository: https://huggingface.co/jhonacmarvik/distilbert-command-classifier

Uses

Direct Use

This model can be directly used for:

  • Command intent classification - Identify what action the user wants to perform
  • Voice assistant routing - Route commands to appropriate handlers (see the dispatch sketch after this list)
  • Natural language interface control - Control systems through natural language
  • Question vs. command detection - Distinguish between questions and actionable commands
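
For the routing use case, the predicted label can drive a small dispatch table. The sketch below is illustrative: the handler functions are hypothetical placeholders, and the label names follow the example output shown further down.

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier"
)

# Hypothetical handlers -- replace with your own integration code.
def turn_on_lights():
    print("Turning lights on")

def turn_off_lights():
    print("Turning lights off")

HANDLERS = {
    "turn_on_lights": turn_on_lights,
    "turn_off_lights": turn_off_lights,
}

def route(text: str):
    prediction = classifier(text)[0]  # top prediction: {'label': ..., 'score': ...}
    handler = HANDLERS.get(prediction["label"])
    if handler is None:
        print(f"No handler registered for {prediction['label']}")
        return
    handler()

route("Turn on all work lights")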

Example Usage

from transformers import pipeline

# Load the classifier
classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier",
    top_k=3
)

# Single prediction
result = classifier("Turn on all work lights")
print(result)
# Output: [
#   {'label': 'turn_on_lights', 'score': 0.9234},
#   {'label': 'increase_brightness', 'score': 0.0543},
#   {'label': 'turn_off_lights', 'score': 0.0123}
# ]

# Batch prediction
commands = [
    "Turn on all work lights",
    "Decrease the brightness",
    "What's the temperature?"
]
results = classifier(commands)

Alternative Usage (Manual)

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained(
    "jhonacmarvik/distilbert-command-classifier"
)
tokenizer = AutoTokenizer.from_pretrained(
    "jhonacmarvik/distilbert-command-classifier"
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Tokenize
text = "Turn on all work lights"
tokens = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
tokens = {k: v.to(device) for k, v in tokens.items()}

# Predict
with torch.no_grad():
    outputs = model(**tokens)
    probs = torch.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probs, dim=-1)
    
print(f"Predicted: {model.config.id2label[predicted_class.item()]}")
print(f"Confidence: {probs[0][predicted_class].item():.4f}")

Downstream Use

Can be integrated into:

  • Smart home systems
  • Voice assistants
  • Chatbots and conversational AI
  • IoT device control interfaces
  • Natural language command parsers

Out-of-Scope Use

This model is NOT suitable for:

  • Commands outside its training vocabulary
  • Languages other than English
  • Sentiment analysis or emotion detection
  • General text classification tasks unrelated to commands
  • Safety-critical applications without human oversight

Bias, Risks, and Limitations

  • Vocabulary Limitation: Model is trained on specific command types and may not generalize to completely novel command categories
  • Typo Handling: While trained on variations with typos, extreme misspellings may reduce accuracy
  • Context Awareness: Model processes single utterances and doesn't maintain conversation context
  • Language: Only supports English language commands

Recommendations

  • Implement confidence thresholds (e.g., > 0.7) before executing commands
  • Provide fallback mechanisms for low-confidence predictions (a combined sketch follows this list)
  • Add human-in-the-loop for critical operations
  • Monitor model performance on production data and retrain periodically
  • Test thoroughly with your specific use case before deployment
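
A minimal sketch combining the first three recommendations (confidence threshold, fallback, human-in-the-loop), assuming the pipeline usage shown earlier; the 0.7 cutoff is the example value from the list above, and the set of critical commands is hypothetical.

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier"
)

CONFIDENCE_THRESHOLD = 0.7               # example value from the recommendations above
CRITICAL_COMMANDS = {"turn_off_lights"}  # hypothetical set of critical operations

def handle_command(text: str):
    prediction = classifier(text)[0]
    label, score = prediction["label"], prediction["score"]

    if score < CONFIDENCE_THRESHOLD:
        return "fallback: ask the user to rephrase"
    if label in CRITICAL_COMMANDS:
        return f"confirm with user before executing {label}"
    return f"execute {label}"

print(handle_command("Trun on teh work lihgts"))  # typo-heavy input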

Training Details

Training Data

  • Dataset: Custom dataset of command variations with intentional typos and paraphrases
  • Size: Multiple variations per command class
  • Format: CSV with text variations and corresponding labels
  • Split: 80% training, 20% validation (stratified; see the split sketch after this list)
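
A sketch of the stratified 80/20 split described above, assuming a CSV with text and label columns (the file name and column names are assumptions, not the actual dataset schema).

import pandas as pd
from sklearn.model_selection import train_test_split

# "commands.csv", "text", and "label" are assumed names for illustration.
df = pd.read_csv("commands.csv")

train_df, val_df = train_test_split(
    df,
    test_size=0.2,           # 20% validation
    stratify=df["label"],    # preserve label proportions in both splits
    random_state=42,
)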

Training Procedure

Preprocessing

  • Text converted to lowercase
  • Tokenization using DistilBERT tokenizer
  • Maximum sequence length: 128 tokens
  • Padding and truncation applied (see the tokenizer sketch after this list)
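
A sketch of how these steps map onto a single tokenizer call. The uncased tokenizer lowercases internally, so the explicit lower() is shown only to mirror the list; padding to max_length is one way to satisfy the padding step.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jhonacmarvik/distilbert-command-classifier")

text = "Turn ON all WORK lights"

encoded = tokenizer(
    text.lower(),            # uncased tokenizer lowercases anyway
    max_length=128,          # maximum sequence length from the list above
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)

print(encoded["input_ids"].shape)  # torch.Size([1, 128])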

Training Hyperparameters

  • Training regime: FP32
  • Optimizer: AdamW
  • Learning rate: 2e-5
  • Warmup steps: 100
  • Weight decay: 0.01
  • Batch size: 16 (per device)
  • Number of epochs: 10
  • Early stopping patience: 3 epochs
  • Evaluation strategy: Per epoch
  • Best model selection: Based on eval_loss (a TrainingArguments sketch mapping these settings follows this list)
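
A sketch of how these settings could be expressed with the Transformers Trainer API. The num_labels value and the tokenized train_dataset / eval_dataset are assumptions standing in for the actual training setup; AdamW is the Trainer default optimizer.

from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

num_labels = 10  # assumption: set to the number of command classes

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=num_labels
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

training_args = TrainingArguments(
    output_dir="./distilbert-command-classifier",
    learning_rate=2e-5,
    warmup_steps=100,
    weight_decay=0.01,
    per_device_train_batch_size=16,
    num_train_epochs=10,
    evaluation_strategy="epoch",    # "eval_strategy" on newer Transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,    # assumed: tokenized datasets prepared elsewhere
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()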

Hardware & Software

  • Framework: PyTorch + Transformers (Hugging Face)
  • Base model: distilbert-base-uncased
  • Hardware: CUDA-enabled GPU or CPU

Evaluation

Metrics

The model was evaluated using the following metrics (a short scikit-learn sketch follows the list):

  • Accuracy: Overall classification accuracy
  • F1 Score: Per-class and macro-averaged F1
  • Precision & Recall: Per-class metrics
  • Confusion Matrix: Visual representation of classification performance
  • ROC-AUC: Per-class ROC curves
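
The per-class metrics can be reproduced with scikit-learn on a labeled held-out set; the texts and labels below are illustrative placeholders, not the actual validation data.

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier"
)

# Illustrative examples; substitute your own labeled validation set.
texts = ["Turn on all work lights", "Turn off the lights"]
true_labels = ["turn_on_lights", "turn_off_lights"]

predicted_labels = [classifier(t)[0]["label"] for t in texts]

print("Accuracy:", accuracy_score(true_labels, predicted_labels))
print(classification_report(true_labels, predicted_labels, zero_division=0))
print(confusion_matrix(true_labels, predicted_labels))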

Results

The model achieves high accuracy on the validation set, with strong performance across all command classes. Detailed metrics are available in the training outputs.

Note: Exact figures depend on the final training run; this card will be updated with concrete values.

How to Get Started

Installation

pip install transformers torch

Quick Start

from transformers import pipeline

classifier = pipeline(
    "text-classification", 
    model="jhonacmarvik/distilbert-command-classifier"
)

result = classifier("Turn on the lights")
print(result)

Production Deployment

For production use with custom loading pattern:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class CommandClassifier:
    def __init__(self):
        model_path = "jhonacmarvik/distilbert-command-classifier"
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path)
        self.model.to(self.device)
        self.model.eval()
    
    def predict(self, text: str, top_k: int = 3):
        tokens = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True)
        tokens = {k: v.to(self.device) for k, v in tokens.items()}
        
        with torch.no_grad():
            logits = self.model(**tokens).logits
            probs = torch.softmax(logits, dim=-1)
            top_probs, top_indices = torch.topk(probs, k=top_k)
        
        results = []
        for prob, idx in zip(top_probs[0], top_indices[0]):
            results.append({
                "label": self.model.config.id2label[idx.item()],
                "score": float(prob.item())
            })
        return results

# Usage
classifier = CommandClassifier()
result = classifier.predict("Turn on lights", top_k=3)

Environmental Impact

Training a single model on standard GPU hardware has minimal environmental impact compared to large language models. This model uses the lightweight DistilBERT architecture, which is significantly more efficient than full-size BERT.

  • Hardware Type: GPU (CUDA-enabled)
  • Compute Region: [Your region]
  • Carbon Impact: Minimal due to efficient architecture

Technical Specifications

Model Architecture

  • Base Architecture: DistilBERT (6-layer, 768-hidden, 12-heads)
  • Parameters: ~66M parameters
  • Classification Head: Linear layer for multi-class classification
  • Dropout: 0.1 (default DistilBERT configuration)
  • Activation: GELU (these values can be verified against the checkpoint config; see the sketch below)
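
The following sketch prints the relevant config fields from the published checkpoint and counts parameters, so the figures above can be checked directly.

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "jhonacmarvik/distilbert-command-classifier"
)

config = model.config
print("Layers:", config.n_layers)          # 6
print("Hidden size:", config.dim)          # 768
print("Attention heads:", config.n_heads)  # 12
print("Dropout:", config.dropout)          # 0.1
print("Label map:", config.id2label)

total_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {total_params / 1e6:.1f}M")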

Compute Infrastructure

Hardware

  • Compatible with CPU and GPU (CUDA)
  • Recommended: GPU with 4GB+ VRAM for faster inference
  • Works on CPU for low-volume applications

Software

  • Python 3.8+
  • PyTorch 2.0+
  • Transformers 4.30+
  • CUDA 11.0+ (for GPU acceleration)

Citation

If you use this model in your research or application, please cite:

@misc{distilbert-command-classifier,
  author = {jhonacmarvik},
  title = {DistilBERT Command Classifier},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/jhonacmarvik/distilbert-command-classifier}}
}

Model Card Authors

jhonacmarvik

Model Card Contact

For questions or issues, please open an issue in the model repository or contact through HuggingFace.
