Truthfulness Detection Model

Fine-tuned BERT model for detecting truthfulness in text at both token and sentence levels.

Model Description

This model uses a dual-classifier architecture on top of BERT to:

  • Classify truthfulness at the sentence level (returns probability 0-1)
  • Classify truthfulness for each token (returns probability 0-1 per token)

Low scores indicate likely false statements, while high scores indicate likely true statements.
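A minimal sketch of what such a dual-classifier head could look like. The class name BERTForDualTruthfulness comes from this repo, but the head design (one linear layer on the pooled [CLS] output, one on each token's hidden state, both followed by a sigmoid) and the tiny random config are assumptions for illustration:

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

class BERTForDualTruthfulness(nn.Module):
    """BERT encoder with two sigmoid heads: one per sentence, one per token."""

    def __init__(self, config: BertConfig):
        super().__init__()
        self.bert = BertModel(config)
        # Sentence head scores the pooled [CLS] representation.
        self.sentence_head = nn.Linear(config.hidden_size, 1)
        # Token head scores every token's hidden state.
        self.token_head = nn.Linear(config.hidden_size, 1)

    def forward(self, input_ids, attention_mask=None):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        sentence_score = torch.sigmoid(self.sentence_head(out.pooler_output)).squeeze(-1)
        token_scores = torch.sigmoid(self.token_head(out.last_hidden_state)).squeeze(-1)
        return sentence_score, token_scores

# Tiny randomly initialized config so the sketch runs without downloading weights;
# the real model uses the full bert-base-uncased configuration.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = BERTForDualTruthfulness(config)
ids = torch.randint(0, 100, (1, 6))   # batch of 1 sentence, 6 tokens
sent, toks = model(ids)
print(sent.shape, toks.shape)         # torch.Size([1]) torch.Size([1, 6])
```

The sigmoid keeps both outputs in the 0-1 range described above: one probability for the whole sentence, one per token.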

Example Output

For "The earth is flat.":

  • Sentence score: 0.0736 (7.36%, correctly identified as false)
  • Token scores: ~0.10 for each token

Training

  • Base model: bert-base-uncased
  • Training samples: 6,330
  • Epochs: 3
  • Batch size: 16
  • Training time: 49 seconds on an H100 GPU
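The recipe above can be sketched as a standard dual-objective fine-tune. The loss weighting (a simple sum of two binary cross-entropy terms) and the label format are assumptions, an embedding layer stands in for the BERT encoder so the sketch runs without downloads, and random tensors stand in for the real 6,330-sample dataset:

```python
import torch
import torch.nn as nn

# Stand-in dual-head model: an embedding layer replaces the BERT encoder.
class DualHead(nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.encoder = nn.Embedding(vocab, dim)
        self.sentence_head = nn.Linear(dim, 1)
        self.token_head = nn.Linear(dim, 1)

    def forward(self, ids):
        h = self.encoder(ids)                                 # (batch, seq, dim)
        sent = self.sentence_head(h.mean(dim=1)).squeeze(-1)  # (batch,)
        toks = self.token_head(h).squeeze(-1)                 # (batch, seq)
        return sent, toks

model = DualHead()
opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.BCEWithLogitsLoss()  # sigmoid folded into the loss for stability

# One step on a random batch of 16 (the card's batch size); 3 epochs over
# 6,330 examples would repeat this roughly 1,188 times.
ids = torch.randint(0, 100, (16, 8))
sent_labels = torch.randint(0, 2, (16,)).float()
tok_labels = torch.randint(0, 2, (16, 8)).float()

sent_logits, tok_logits = model(ids)
loss = loss_fn(sent_logits, sent_labels) + loss_fn(tok_logits, tok_labels)
opt.zero_grad()
loss.backward()
opt.step()
```

Summing the two losses trains both heads jointly over the shared encoder; how the real training run balances them is not stated in the card.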

Custom Architecture Required

⚠️ This model uses a custom BERTForDualTruthfulness class. You cannot load it with the standard AutoModel API; see the implementation code for the model class definition.

License: MIT
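Because AutoModel cannot load it, you instantiate the custom class yourself and restore the weights into it. The sketch below uses a simplified stand-in class and torch.save/torch.load for the round trip; the actual repo ships safetensors weights and the real class definition, both of which you would substitute in:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Simplified stand-in; in practice, import BERTForDualTruthfulness from the
# repo's implementation code instead.
class BERTForDualTruthfulness(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.sentence_head = nn.Linear(dim, 1)
        self.token_head = nn.Linear(dim, 1)

model = BERTForDualTruthfulness()
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)

# Loading: construct the class first, then restore the weights into it.
restored = BERTForDualTruthfulness()
restored.load_state_dict(torch.load(path))
same = torch.equal(restored.sentence_head.weight, model.sentence_head.weight)
print(same)  # True
```

For the repo's safetensors checkpoint, safetensors.torch.load_file would replace torch.load here; the key point is that the class definition must exist before the weights can be loaded.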

Model Details

  • Model size: 109M parameters
  • Tensor type: F32 (safetensors)