Truthfulness Detection Model
Fine-tuned BERT model for detecting truthfulness in text at both token and sentence levels.
Model Description
This model uses a dual-classifier architecture on top of BERT to:
- Classify truthfulness at the sentence level (one probability in [0, 1] per sentence)
- Classify truthfulness at the token level (one probability in [0, 1] per token)
Low scores indicate a likely false statement; high scores indicate a likely true statement.
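The dual-classifier design can be sketched as two small linear heads over a shared BERT encoder: one scoring the pooled [CLS] vector, one scoring every token. The head sizes and the use of a sigmoid are assumptions for illustration, not the released implementation (a tiny random config is used so the sketch runs without downloading weights):

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

class BERTForDualTruthfulness(nn.Module):
    """Hypothetical sketch of the dual-classifier architecture.

    One linear head scores the whole sentence via the pooled [CLS]
    output; a second scores every token. Details are assumptions.
    """

    def __init__(self, config: BertConfig):
        super().__init__()
        self.bert = BertModel(config)
        self.sentence_head = nn.Linear(config.hidden_size, 1)
        self.token_head = nn.Linear(config.hidden_size, 1)

    def forward(self, input_ids, attention_mask=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Sentence-level probability in [0, 1] from the pooled [CLS] vector
        sentence_prob = torch.sigmoid(
            self.sentence_head(outputs.pooler_output)
        ).squeeze(-1)
        # Per-token probabilities in [0, 1]
        token_probs = torch.sigmoid(
            self.token_head(outputs.last_hidden_state)
        ).squeeze(-1)
        return sentence_prob, token_probs

# Tiny randomly initialised config so the sketch runs offline
config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    vocab_size=100)
model = BERTForDualTruthfulness(config)
ids = torch.randint(0, 100, (1, 6))   # one 6-token "sentence"
sent_p, tok_p = model(ids)
print(sent_p.shape, tok_p.shape)      # torch.Size([1]) torch.Size([1, 6])
```

With random weights the scores are meaningless; only the shapes illustrate the two output granularities.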
Example Output
For "The earth is flat.":
- Sentence score: 0.0736 (7.36% - correctly identified as false)
- Token scores: ~0.10 for each token
Training
- Base model: bert-base-uncased
- Training samples: 6,330
- Epochs: 3
- Batch size: 16
- Training time: 49 seconds on an H100 GPU
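The hyperparameters above slot into a standard fine-tuning step. A minimal sketch follows, using synthetic labels and a summed BCE loss at both levels; the real dataset, loss weighting, and learning rate are not stated in this card, so they are assumptions (a tiny config again stands in for bert-base-uncased so the sketch runs offline):

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

# Tiny config stands in for bert-base-uncased; heads mirror the dual design
config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    vocab_size=100)
bert = BertModel(config)
sentence_head = nn.Linear(config.hidden_size, 1)
token_head = nn.Linear(config.hidden_size, 1)

params = (list(bert.parameters())
          + list(sentence_head.parameters())
          + list(token_head.parameters()))
optimizer = torch.optim.AdamW(params, lr=2e-5)  # lr is an assumption
loss_fn = nn.BCEWithLogitsLoss()

# One synthetic batch of 16 (the card's batch size); labels are random
input_ids = torch.randint(0, 100, (16, 8))
sent_labels = torch.randint(0, 2, (16,)).float()
tok_labels = torch.randint(0, 2, (16, 8)).float()

out = bert(input_ids)
sent_logits = sentence_head(out.pooler_output).squeeze(-1)
tok_logits = token_head(out.last_hidden_state).squeeze(-1)
# Sum the sentence and token objectives; the real weighting is unknown
loss = loss_fn(sent_logits, sent_labels) + loss_fn(tok_logits, tok_labels)
loss.backward()
optimizer.step()
print(float(loss))
```

In the real run this step repeats over 6,330 samples for 3 epochs.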
Custom Architecture Required
⚠️ This model uses a custom `BERTForDualTruthfulness` class; it cannot be loaded with the standard `AutoModel`. See the implementation code for the model class definition.
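Since `AutoModel` cannot resolve the custom class, loading presumably means instantiating `BERTForDualTruthfulness` yourself and restoring a saved state dict. A hedged round-trip sketch, where the minimal stand-in class, checkpoint filename, and config are assumptions:

```python
import os
import tempfile
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

class BERTForDualTruthfulness(nn.Module):
    # Minimal stand-in; the real class adds two classifier heads on BERT
    def __init__(self, config: BertConfig):
        super().__init__()
        self.bert = BertModel(config)
        self.sentence_head = nn.Linear(config.hidden_size, 1)
        self.token_head = nn.Linear(config.hidden_size, 1)

config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    vocab_size=100)

# Save a checkpoint, then load it back into a fresh instance —
# the manual pattern AutoModel cannot perform for a custom class
path = os.path.join(tempfile.mkdtemp(), "pytorch_model.bin")
torch.save(BERTForDualTruthfulness(config).state_dict(), path)

model = BERTForDualTruthfulness(config)
missing, unexpected = model.load_state_dict(torch.load(path))
print(len(missing), len(unexpected))  # 0 0
```

Empty `missing`/`unexpected` key lists confirm the checkpoint matched the class definition exactly.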
License: MIT