R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Paper • 2503.05592 • Published 6 days ago • 24
🧠 Abliteration Collection Uncensored models using abliteration. See this article for more information: huggingface.co/blog/mlabonne/abliteration • 7 items • Updated Nov 18, 2024 • 30
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Paper • 2502.12084 • Published 24 days ago • 29
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published 22 days ago • 38
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model Paper • 2502.13449 • Published 23 days ago • 42
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published 21 days ago • 92
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published 21 days ago • 162
Qwen2.5-3B-GRPO Collection Trained with unsloth on just 250 steps (resource constraints) on GSM8K to add reasoning abilities to Qwen2.5-3B (smaller model because resources) • 3 items • Updated 18 days ago