Beyond Correlation: Interpretable Evaluation of Machine Translation Metrics Paper • 2410.05183 • Published Oct 7, 2024 • 1
Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering Paper • 2503.14996 • Published Mar 19, 2025 • 3
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 756
Stress-Testing MGT Detectors via Stylistic Alignment Collection Dataset and Models for the ACL 2025 paper "Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors" • 10 items • Updated Jul 4, 2025 • 1
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization Paper • 2506.10920 • Published Jun 12, 2025 • 5
Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors Paper • 2505.24523 • Published May 30, 2025 • 9
Evaluating Lexical Proficiency in Neural Language Models Collection Public collection for our paper: "Evaluating Lexical Proficiency in Neural Language Models", C. Ciaccio, A. Miaschi, F. Dell'Orletta (ACL 2025) • 5 items • Updated May 26, 2025 • 2
Steering Large Language Models for Machine Translation Personalization Paper • 2505.16612 • Published May 22, 2025 • 6
ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models Paper • 2505.13180 • Published May 19, 2025 • 13
EuroBERT Collection Scaling Multilingual Encoders for European Languages • 4 items • Updated Mar 10, 2025 • 13
Gemma Neogenesis 💎🌍🇮🇹 Collection Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy • 12 items • Updated Oct 25, 2025 • 5
🇮🇹👓 LLaVA-NDiNO Collection HF Collection for the models of the paper "LLaVA-NDiNO: Empowering LLMs with Multimodality for the Italian Language" • 7 items • Updated Oct 20, 2024 • 3
Pythia Scaling Suite Collection Pythia is the first LLM suite designed specifically to enable scientific research on LLMs. To learn more see https://github.com/EleutherAI/pythia • 18 items • Updated Feb 26, 2025 • 31
ITA-Bench: Italian Benchmarks for LLMs Collection A collection of Italian benchmarks for Large Language Models. See also our GitHub repo: https://github.com/SapienzaNLP/ita-bench • 23 items • Updated Nov 22, 2025 • 8
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9, 2024 • 40