Steffen Röcker PRO

sroecker

https://x.com/sroecker

AI & ML interests

Local models

Recent Activity

liked a model 2 days ago

simplescaling/s1-32B

liked a model 3 days ago

ibm-granite/granite-vision-3.1-2b-preview

sroecker's activity

upvoted a collection 3 days ago

EuroLLM

4 items • Updated Dec 2, 2024 • 27

upvoted a paper 3 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 10 days ago • 100

upvoted an article 7 days ago

Article

Replicating DeepSeek R1 for Information Extraction

7 days ago • 29

upvoted a collection 7 days ago

R1 Multilingual

5 items • Updated 7 days ago • 7

upvoted a paper 7 days ago

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Paper • 2501.18511 • Published 8 days ago • 17

upvoted a collection 7 days ago

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 10 items • Updated 8 days ago • 88

upvoted an article 10 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

10 days ago • 657

upvoted a collection 11 days ago

Quantized DeepSeek R1 Distill

3 items • Updated 16 days ago • 3

upvoted a collection 13 days ago

DeepSeek-R1-abliterated

7 items • Updated 7 days ago • 41

upvoted a collection 14 days ago

Language Detection

StaticVectors models to detect language. Exports of FastText that run in NumPy without needing FastText • 2 items • Updated 12 days ago • 3

upvoted a collection 16 days ago

DeepSeek R1 AWQ

7 items • Updated 16 days ago • 4

upvoted a paper 19 days ago

CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval

Paper • 2411.12644 • Published Nov 19, 2024 • 3

upvoted a collection 23 days ago

InternLM3

6 items • Updated 21 days ago • 21

upvoted an article 24 days ago

Article

Python Is All You Need? Introducing Dria-Agent-α

28 days ago • 22

upvoted a paper 29 days ago

KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model

Paper • 2501.01028 • Published Jan 2 • 13

upvoted a collection 29 days ago

KaLM-embedding

8 items • Updated 1 day ago • 22

upvoted an article 30 days ago

Article

Synthetic Data Generation with FastData and Hugging Face

about 1 month ago • 14

upvoted an article about 1 month ago

Article

Fine-tune a SmolLM on domain-specific synthetic data from a LLM

Jan 3 • 32

upvoted a collection about 1 month ago

Dolphin 3.0

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated about 8 hours ago • 65

upvoted an article about 1 month ago

Article

Upgrading Kokoro: natural TTS for short bursts

Nov 22, 2024 • 26