BigCode

Enterprise

non-profit

https://www.bigcode-project.org/

bigcode-project

Recent Activity

wyu1 authored a paper 2 days ago

OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas

terryyz updated a dataset 7 days ago

bigcode/bigcodebench-hard-results

terryyz updated a dataset 7 days ago

bigcode/bigcodebench-hard-solve-rate

Articles

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

bigcode's activity

lewtun

posted an update 6 days ago

Post

9480

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1

5 replies

terryyz

updated 5 datasets 7 days ago

bigcode/bigcodebench-hard-results

Viewer • Updated 7 days ago • 163 • 93 • 1

bigcode/bigcodebench-hard-solve-rate

Viewer • Updated 7 days ago • 296 • 109 • 1

bigcode/bigcodebench-hard-domain

Viewer • Updated 7 days ago • 297 • 93 • 1

bigcode/bigcodebench-hard-perf

Viewer • Updated 7 days ago • 297 • 90

bigcode/bigcodebench-results

Viewer • Updated 7 days ago • 163 • 99 • 2

terryyz

updated a Space 8 days ago

BigCodeBench Leaderboard

terryyz

updated 3 datasets 8 days ago

bigcode/bigcodebench-solve-rate

Viewer • Updated 8 days ago • 2.28k • 86

bigcode/bigcodebench-domain

Viewer • Updated 8 days ago • 247 • 84

bigcode/bigcodebench-perf

Viewer • Updated 8 days ago • 247 • 67

terryyz

updated a Space 9 days ago

BigCodeBench Evaluator

terryyz

in bigcode/bigcode-models-leaderboard 11 days ago

Phi4 - Is it the new best small model?

#89 opened 14 days ago by

terryyz

updated a model 17 days ago

bigcode/bcb_update

Updated 17 days ago

terryyz

updated 4 datasets 17 days ago

bigcode/bigcodebench

Viewer • Updated 17 days ago • 4.56k • 9.82k • 49

bigcode/bigcodebench-hard

Viewer • Updated 17 days ago • 592 • 3.4k • 1

bigcode/bigcodebench-hard-elo

Viewer • Updated 17 days ago • 268 • 76

bigcode/bigcodebench-elo

Viewer • Updated 17 days ago • 222 • 82

terryyz

in bigcode/bigcodebench 17 days ago

Incorrect URL in tests for example 1005

#3 opened 18 days ago by

LouisCastricato

authored a paper 18 days ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 22 days ago • 90

BrigitteTousi

posted an update 21 days ago

Post

1068

Community fine-tuned models are more carbon efficient than the models they are derived from! 🥳🌿

@alozowski @clefourrier @SaylorTwift @albertvillanova evaluated CO₂ emissions associated with model inference for over 3000 models on the Open LLM Leaderboard. Interesting trends and new insights emerged...👀

Blog Post: https://huggingface.co/blog/leaderboard-emissions-analysis

Leaderboard: open-llm-leaderboard/open_llm_leaderboard