EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published 7 days ago • 72
SynthDetoxM Collection Data and models from NAACL 2025 paper "SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators" by Moskovskiy et al. • 4 items • Updated 8 days ago • 2
When an LLM is apprehensive about its answers -- and when its uncertainty is justified Paper • 2503.01688 • Published 11 days ago • 19
GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published 16 days ago • 63
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published 21 days ago • 162
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published 22 days ago • 85
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published 23 days ago • 67
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10 • 86
DTF Collection Fine-tune of the Qwen-2.5-7B model on a dump of DTF posts and comments. • 3 items • Updated Feb 7