# Plot Arc Classifier - DeBERTa Small
A fine-tuned DeBERTa-v3-small model for classifying character plot arc types in narrative text.
## Model Details

### Model Description
This model classifies character descriptions into four plot arc categories:
- NONE (0): No discernible character development or plot arc
- INTERNAL (1): Character growth driven by internal conflict/psychology
- EXTERNAL (2): Character arc driven by external events/missions
- BOTH (3): Character arc with both internal conflict and external drivers
- **Model Type:** Text Classification (Sequence Classification)
- **Base Model:** microsoft/deberta-v3-small (~60M parameters)
- **Language:** English
- **License:** MIT
### Model Architecture
- Base: DeBERTa-v3-Small (60M parameters)
- Task: 4-class sequence classification
- Input: Character descriptions (max 512 tokens)
- Output: Classification logits + probabilities for 4 classes
## Training Data

### Dataset Statistics
- Total Examples: 101,348
- Training Split: 91,213 examples (90%)
- Validation Split: 10,135 examples (10%; a split sketch follows this list)
- Perfect Class Balance: 25,337 examples per class
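
The card does not say how the split was drawn; since the per-class validation supports reported below are close to, but not exactly, one quarter of 10,135, a plain random 90/10 split is the likeliest reading. A minimal sketch with the `datasets` library (the toy rows are placeholders for the real corpus):

```python
from datasets import Dataset

# Toy stand-in for the real 101,348-row corpus.
ds = Dataset.from_dict({
    "text": [f"Character description {i}." for i in range(120)],
    "label": [i % 4 for i in range(120)],  # 0=NONE, 1=INTERNAL, 2=EXTERNAL, 3=BOTH
})

# Random 90/10 split; the seed is an assumption for reproducibility.
split = ds.train_test_split(test_size=0.10, seed=42)
train_ds, val_ds = split["train"], split["test"]
print(len(train_ds), len(val_ds))  # 108 12
```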
### Data Sources
- Systematic scanning of 1.8M+ character descriptions
- LLM validation using Llama-3.2-3B for quality assurance
- SHA256-based deduplication to prevent train/validation leakage (see the sketch after this list)
- Carefully curated and balanced dataset across all plot arc types
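
The exact normalization applied before hashing is not documented; the following is a minimal sketch of SHA256-based deduplication, assuming simple lowercasing and whitespace collapsing:

```python
import hashlib

def deduplicate(descriptions):
    """Drop exact duplicates by SHA256 of a normalized string.

    The normalization (lowercase, collapsed whitespace) is an assumption;
    the card states only that deduplication was SHA256-based.
    """
    seen = set()
    unique = []
    for text in descriptions:
        key = hashlib.sha256(" ".join(text.lower().split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique

print(deduplicate(["A  hero rises.", "a hero rises.", "A villain falls."]))
# ['A  hero rises.', 'A villain falls.']
```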
### Class Distribution
| Class    | Count  | Percentage |
|----------|--------|------------|
| NONE     | 25,337 | 25% |
| INTERNAL | 25,337 | 25% |
| EXTERNAL | 25,337 | 25% |
| BOTH     | 25,337 | 25% |
## Performance

### Key Metrics
- Accuracy: 0.7286
- F1 (Weighted): 0.7283
- F1 (Macro): 0.7275
### Per-Class Performance
| Class    | Precision | Recall | F1-Score | Support |
|----------|-----------|--------|----------|---------|
| NONE     | 0.697 | 0.613 | 0.653 | 2,495 |
| INTERNAL | 0.677 | 0.683 | 0.680 | 2,571 |
| EXTERNAL | 0.892 | 0.882 | 0.887 | 2,568 |
| BOTH     | 0.652 | 0.732 | 0.690 | 2,501 |
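
These figures follow the standard scikit-learn definitions. A minimal sketch of reproducing them from model predictions; the `y_true`/`y_pred` arrays here are tiny placeholders, not the real validation outputs:

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Placeholder predictions; in practice, run the model over the
# 10,135-example validation split to obtain these arrays.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 3, 1, 1, 2, 2, 3, 1]

print("Accuracy:     ", accuracy_score(y_true, y_pred))
print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print("F1 (macro):   ", f1_score(y_true, y_pred, average="macro"))
print(classification_report(
    y_true, y_pred, target_names=["NONE", "INTERNAL", "EXTERNAL", "BOTH"]
))
```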
## Training Details
- Training Time: 9.7 hours on Apple Silicon MPS
- Final Training Loss: 0.635
- Epochs: 3.86 (early stopping)
- Batch Size: 16 (effective: 32 with gradient accumulation)
- Learning Rate: 2e-5 with warmup
- Optimizer: AdamW with weight decay (0.01); a configuration sketch follows
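
A sketch of how these settings map onto Hugging Face `TrainingArguments`. The warmup ratio, evaluation cadence, and early-stopping patience are assumptions (the card lists only the headline values), and `model`, `train_ds`, and `val_ds` are placeholders for objects built as in the other sketches on this card:

```python
import numpy as np
from sklearn.metrics import f1_score
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, preds, average="weighted")}

args = TrainingArguments(
    output_dir="plot-arc-classifier-deberta-small",
    learning_rate=2e-5,
    warmup_ratio=0.1,               # assumption: card says only "with warmup"
    weight_decay=0.01,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,  # effective batch size 32
    num_train_epochs=5,             # early stopping halted training at ~3.86 epochs
    eval_strategy="steps",
    eval_steps=500,                 # assumption
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,                    # DeBERTa-v3-small with a 4-class head
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # assumption
)
trainer.train()
```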
## Usage

### Basic Usage
```python
import torch
from transformers import DebertaV2ForSequenceClassification, DebertaV2Tokenizer

# Load model and tokenizer
model_name = "Mitchins/deberta-v3-s-plot-arc-classifier"
tokenizer = DebertaV2Tokenizer.from_pretrained(model_name)
model = DebertaV2ForSequenceClassification.from_pretrained(model_name)
model.eval()

# Example text
text = "Sir Galahad embarks on a perilous quest to retrieve the stolen Crown of Ages."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
probabilities = torch.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(probabilities, dim=-1)

# Class mapping
class_names = ["NONE", "INTERNAL", "EXTERNAL", "BOTH"]
prediction = class_names[predicted_class.item()]
confidence = probabilities[0, predicted_class.item()].item()
print(f"Predicted class: {prediction} (confidence: {confidence:.3f})")
```
### Pipeline Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Mitchins/deberta-v3-s-plot-arc-classifier",
    return_all_scores=True,
)

result = classifier("Captain Torres must infiltrate enemy lines while battling his own cowardice.")
print(result)
```
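
Each input yields a list of `{'label': ..., 'score': ...}` dicts, one per class. Note that `return_all_scores=True` is deprecated in recent `transformers` releases in favor of `top_k=None`, and that the label strings depend on the checkpoint's `id2label` mapping; if that mapping is absent, they appear as generic `LABEL_0` through `LABEL_3` in the class order listed above (NONE, INTERNAL, EXTERNAL, BOTH).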
## Example Classifications
| Class | Type | Example | Prediction | Confidence |
|-------|------|---------|------------|------------|
| NONE | Simple | "Margaret runs the village bakery, making fresh bread every morning at 5 AM for the past thirty years." | NONE ✅ | 0.997 |
| NONE | Nuanced | "Dr. Harrison performs routine medical check-ups with methodical precision, maintaining professional distance while patients share their deepest fears about mortality." | NONE ⚠️ | 0.581 |
| INTERNAL | Simple | "Emma struggles with overwhelming anxiety after her father's harsh criticism, questioning her self-worth and abilities." | INTERNAL ✅ | 0.983 |
| INTERNAL | Nuanced | "The renowned pianist Clara finds herself paralyzed by perfectionism, her childhood trauma surfacing as she prepares for the performance that could define her legacy." | INTERNAL ✅ | 0.733 |
| EXTERNAL | Simple | "Knight Roderick embarks on a dangerous quest to retrieve the stolen crown from the dragon's lair." | EXTERNAL ✅ | 0.717 |
| EXTERNAL | Nuanced | "Master thief Elias infiltrates the heavily guarded fortress, disabling security systems and evading patrol routes, each obstacle requiring new techniques and tools to reach the vault." | EXTERNAL ✅ | 0.711 |
| BOTH | Simple | "Sarah must rescue her kidnapped daughter from the terrorist compound while confronting her own paralyzing guilt about being an absent mother." | BOTH ⚠️ | 0.578 |
| BOTH | Nuanced | "Archaeologist Sophia discovers an ancient artifact that could rewrite history, but must confront her own ethical boundaries and childhood abandonment issues as powerful forces try to silence her." | BOTH ✅ | 0.926 |
**Results:** 8/8 correct predictions on these examples; ⚠️ marks correct but lower-confidence calls.
## Limitations
- Domain: Optimized for character descriptions in narrative fiction
- Length: Maximum 512 tokens (longer inputs are truncated; see the chunking sketch after this list)
- Language: English only
- Context: Works best with character-focused descriptions rather than plot summaries
- Ambiguity: Some edge cases may be inherently ambiguous between INTERNAL/BOTH
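
For texts beyond the 512-token limit, one possible workaround (not part of the released model, and untested against it) is to classify overlapping token windows and average the class probabilities:

```python
import torch

def classify_long_text(text, tokenizer, model, max_length=512, stride=128):
    """Average class probabilities over overlapping token windows.

    This chunking strategy is a workaround sketch, not part of the
    released model; arcs spanning distant passages may still be missed.
    """
    enc = tokenizer(
        text,
        truncation=True,
        max_length=max_length,
        stride=stride,                    # overlap between consecutive windows
        return_overflowing_tokens=True,   # one row per window
        return_tensors="pt",
        padding=True,
    )
    with torch.no_grad():
        logits = model(
            input_ids=enc["input_ids"], attention_mask=enc["attention_mask"]
        ).logits
    return torch.softmax(logits, dim=-1).mean(dim=0)  # shape: (4,)
```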
## Ethical Considerations
- Bias: Training data may contain genre/cultural biases toward certain character archetypes
- Interpretation: Classifications reflect Western narrative theory; other storytelling traditions may not map perfectly
- Automation: Should complement, not replace, human literary analysis
## Citation
```bibtex
@misc{plot_arc_classifier_2025,
  title={Plot Arc Classifier - DeBERTa Small},
  author={Claude Code Assistant},
  year={2025},
  url={https://github.com/your-org/plot-arc-classifier},
  note={Fine-tuned DeBERTa-v3-small for character plot arc classification}
}
```
## Model Card Contact
For questions about this model, please open an issue in the repository or contact the maintainers.
Model trained on 2025-09-02 using the Hugging Face `transformers` library.