Stephen Oates PRO

soates

AI & ML interests

None yet

Organizations

None yet

soates's activity

liked a Space 4 days ago

1.42k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted an article 22 days ago

Article

Open-R1: Update #1

and 7 others • 22 days ago • 286

upvoted an article 27 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

27 days ago • 771

upvoted a collection about 1 month ago

EvaByte

Collection

3 items • Updated Jan 21 • 3

upvoted an article about 1 month ago

Article

Mastering Tensor Dimensions in Transformers

Jan 12 • 44

upvoted a paper 2 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

liked a model 2 months ago

Datou1111/shou_xin

Text-to-Image • Updated Dec 9, 2024 • 2.18k • 863

upvoted a paper 5 months ago

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15, 2024 • 88

liked a model 5 months ago

lamm-mit/LifeGPT

Updated Sep 19, 2024 • 8

upvoted an article 5 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024 • 223

liked a Space 6 months ago

112

Open-LLM performances are plateauing, let’s make the leaderboard steep again

🏔

Update leaderboard for fair model evaluation

upvoted 2 articles 6 months ago

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19, 2024 • 77

Article

A failed experiment: Infini-Attention, and why we should keep trying?

Aug 14, 2024 • 59

liked a model 7 months ago

nisten/Biggie-SmoLlm-0.15B-Base

Text Generation • Updated Aug 7, 2024 • 923 • 233

liked a Space 7 months ago

Gpt2 Multiplication Predictor

📈

Multiply large numbers using different reasoning methods

upvoted an article 9 months ago

Article

Uncensor any LLM with abliteration

Jun 13, 2024 • 448

liked a Space 9 months ago

775

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality web text data for LLM training

liked a Space 10 months ago

270

Phi-3 WebGPU

🚀

A private and powerful AI that runs locally in your browser

updated a collection 10 months ago

Llms

Collection

2 items • Updated May 8, 2024