Garreth Lee

garrethlee

AI & ML interests

None yet

Recent Activity

upvoted an article 24 days ago

FastRTC: The Real-Time Communication Library for Python

liked a Space 29 days ago

nanotron/ultrascale-playbook

upvoted an article about 1 month ago

1 Billion Classifications

garrethlee's activity

liked a Space 29 days ago

2.33k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLMs on large GPU clusters

liked a model about 2 months ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 26 days ago • 1.68M • 11.5k

liked a dataset 3 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated Jan 8 • 12.5B • 70.8k • 448

liked a Space 3 months ago

Number Tokenization Blog

📈

Explore how tokenization affects arithmetic in LLMs

liked 2 Spaces 4 months ago

468

Synthetic Data Generator

🧬

Build datasets using natural language

Hub LFS Analysis

📈

An analysis of LFS files on the Hub.

liked a model 4 months ago

GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct

Updated Nov 6, 2024 • 2.23k • 35

liked a Space 4 months ago

Sahabat-AI Chatbot (Gemma2 9b)

😻

Chatbot

liked 2 datasets 4 months ago

indolem/IndoMMLU

Updated Oct 11, 2023 • 19.5k • 17

PleIAs/common_corpus

Viewer • Updated Feb 11 • 470M • 42.7k • 245

liked 2 Spaces 5 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

106

TxT360: Trillion Extracted Text

📖

Create a large, deduplicated dataset for LLM pre-training

liked a Space 6 months ago

919

Model Memory Utility

🚀

Calculate memory needed to train AI models

liked a Space 7 months ago

889

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality web text data for LLM training

liked a model 12 months ago

mistralai/Mistral-7B-Instruct-v0.2

Text Generation • Updated Sep 27, 2024 • 3.86M • 2.69k