question-complexity-classifier

🤖 Fine-tuned DistilBERT model for classifying question complexity (Simple vs Complex)

Model Details

Model Description

  • Architecture: DistilBERT base uncased
  • Fine-tuned on: Question Complexity Classification Dataset
  • Language: English
  • License: Apache 2.0
  • Max Sequence Length: 128 tokens
  • Parameters: ~67M (F32, Safetensors)
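
If you prefer working below the `pipeline` abstraction, the checkpoint should load with the standard Auto classes. A minimal sketch (the example question and printed label are illustrative; label names are assumed from the pipeline output shown in the Uses section):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer.
model_id = "grahamaco/question-complexity-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Tokenize with the same 128-token limit used during training.
inputs = tokenizer(
    "What is the capital of France?",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits

# Map the argmax logit back to a label via the model config.
predicted = model.config.id2label[logits.argmax(dim=-1).item()]
print(predicted)  # e.g. 'SIMPLE' (label name assumed from the pipeline example)
```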

Uses

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="grahamaco/question-complexity-classifier",
    tokenizer="grahamaco/question-complexity-classifier",
    truncation=True,
    max_length=128,  # Matches training config
)

result = classifier("Explain quantum computing in simple terms")
# Output example: [{'label': 'COMPLEX', 'score': 0.97}]
```
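
The pipeline also accepts a list of questions, which is convenient for scoring datasets in bulk. A small sketch (the `batch_size` value is an arbitrary starting point to tune for your hardware):

```python
questions = [
    "What is 2 + 2?",
    "Compare the trade-offs between eventual and strong consistency in distributed databases.",
]

# Reuses the `classifier` pipeline defined above.
results = classifier(questions, batch_size=16)
for question, result in zip(questions, results):
    print(f"{result['label']} ({result['score']:.2f}): {question}")
```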

Training Details

  • Epochs: 5
  • Batch Size: 32 (global)
  • Learning Rate: 2e-5
  • Train/Val/Test Split: 80/10/10 (stratified)
  • Early Stopping: Patience of 2 epochs
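
The original training script is not published; the sketch below mirrors the hyperparameters above using the Hugging Face `Trainer`. The CSV files and the `question`/`label` column names are placeholders, since the card only names the dataset informally:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Placeholder files: each CSV is assumed to have "question" and "label" columns.
raw = load_dataset("csv", data_files={"train": "train.csv", "validation": "val.csv"})

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Same 128-token truncation as at inference time.
    return tokenizer(batch["question"], truncation=True, max_length=128)

tokenized = raw.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

args = TrainingArguments(
    output_dir="question-complexity-classifier",
    num_train_epochs=5,              # Epochs: 5
    per_device_train_batch_size=32,  # Batch Size: 32 (global; single device assumed)
    learning_rate=2e-5,              # Learning Rate: 2e-5
    eval_strategy="epoch",           # Evaluate each epoch so early stopping can trigger
    save_strategy="epoch",
    load_best_model_at_end=True,     # Required by EarlyStoppingCallback
    metric_for_best_model="loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),  # Dynamic padding per batch
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # Patience: 2 epochs
)
trainer.train()
```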

Evaluation Results

| Metric   | Value |
|----------|-------|
| Accuracy | 0.92  |
| F1 Score | 0.91  |
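
Metrics of this kind are typically computed on the held-out 10% test split with scikit-learn. A minimal sketch with illustrative labels (the card does not state how F1 is averaged, so binary F1 on the COMPLEX class is assumed here):

```python
from sklearn.metrics import accuracy_score, f1_score

# Illustrative labels only; in practice these come from running the
# classifier over the held-out test split.
y_true = ["SIMPLE", "COMPLEX", "COMPLEX", "SIMPLE", "COMPLEX"]
y_pred = ["SIMPLE", "COMPLEX", "SIMPLE", "SIMPLE", "COMPLEX"]

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred, pos_label="COMPLEX"))
```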

Performance

| Metric            | Value                  |
|-------------------|------------------------|
| Inference Latency | 15.2 ms (CPU)          |
| Throughput        | 68.4 samples/sec (GPU) |
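
These figures depend heavily on hardware, sequence length, and batching. A rough way to measure latency on your own machine (a sketch; 100 timed runs is an arbitrary choice):

```python
import time
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="grahamaco/question-complexity-classifier",
    truncation=True,
    max_length=128,
)

question = "Explain quantum computing in simple terms"
classifier(question)  # warm-up so model load and first-call overhead aren't timed

runs = 100
start = time.perf_counter()
for _ in range(runs):
    classifier(question)
elapsed = time.perf_counter() - start
print(f"mean latency: {elapsed / runs * 1000:.1f} ms/question")
```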

Ethical Considerations

This model is intended for educational content classification only. Developers should:

  • Regularly audit performance across different question types
  • Monitor for unintended bias in complexity assessments
  • Provide human-review mechanisms for high-stakes classifications (see the sketch after this list)
  • Validate classifications against original context when used with RAG systems
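
One lightweight way to implement the human-review recommendation above is to route low-confidence predictions to a reviewer queue. A sketch, reusing the `classifier` pipeline from the Uses section (the 0.75 threshold is a placeholder to calibrate on your own validation data):

```python
REVIEW_THRESHOLD = 0.75  # Placeholder; calibrate on your own validation data.

def classify_with_review(question: str) -> dict:
    """Classify a question, flagging low-confidence results for human review."""
    # `classifier` is the pipeline from the Uses section above.
    result = classifier(question)[0]
    result["needs_human_review"] = result["score"] < REVIEW_THRESHOLD
    return result

print(classify_with_review("Summarize the ethical debates around CRISPR gene editing."))
```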