adaamko posted an update about 16 hours ago
We introduced 🥬 TinyLettuce: lightweight hallucination detection with 17–68M encoders

Instead of relying on huge LLM judges that are slow and costly, we built tiny Ettin-based detectors that you can train in hours on a single GPU and run efficiently on CPU.

Here’s what’s inside:
1️⃣ Synthetic Data Generator
A toolkit to create hallucinations with controllable error types—no manual annotation bottlenecks.
2️⃣ TinyLettuce Models (17–68M)
Compact classifiers built on Ettin encoders, designed for efficiency (8K context, modern transformer backbone).
3️⃣ Data & Training Utilities
Scripts and APIs to generate domain-specific labeled pairs at scale, plus ~3.6k examples we used for training.
4️⃣ Open and MIT-licensed
Code, data, and models are freely available for research and production.
5️⃣ Performance Highlights
- TinyLettuce-17M reaches 90.87% F1 (synthetic), outperforming GPT-OSS-120B (83.38%) and Qwen3-235B (79.84%)
- Runs in real-time on CPU: low latency, minimal memory, and pennies per million checks
- Shows competitive results on RAGTruth benchmarks

🔗 Useful links:
Blog: https://huggingface.co/blog/adaamko/tinylettuce
GitHub: https://github.com/KRLabsOrg/LettuceDetect
PyPI: https://pypi.org/project/lettucedetect/
HF Collection: KRLabsOrg/tinylettuce-68b42a66b8b6aaa4bf287bf4
Notebook/Demo:
https://github.com/KRLabsOrg/LettuceDetect/blob/main/demo/tinylettuce.ipynb

⭐ Stars and feedback are always appreciated!