---
license: apache-2.0
datasets:
- open-r1/codeforces-cots_decontaminated
language:
- en
base_model:
- Qwen/Qwen2.5-Coder-32B-Instruct
pipeline_tag: text-generation
---

# Model Card for OlympicCoder-32B

OlympicCoder-32B is a medium-sized code model that achieves strong performance on coding benchmarks such as LiveCodeBench and the new International Olympiad in Informatics (IOI) benchmark.

## Model description

- **Model type:** A 32B-parameter model fine-tuned on a decontaminated version of the Codeforces dataset.
- **Language(s) (NLP):** Primarily English
- **License:** apache-2.0
- **Finetuned from model:** [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct)

## Performance

| Model | LiveCodeBench (LCB) | IOI |
|-------|---------------------|-----|
| GPT-4o | 28.43 | - |
| Claude 3.7 Sonnet | 39.18 | 93 |
| QwQ-32B | 60.98 | 127 |
| DeepSeek-R1-Distill-Qwen-32B | 56.58 | - |
| DeepSeek-R1-Distill-Qwen-7B | 37.36 | - |
| Qwen2.5-Coder-32B-Instruct | 28.31 | 35 |
| Qwen2.5-Coder-7B-Instruct | 15.83 | 45 |
| DeepSeek-R1 | - | 137 |
| OlympicCoder-7B | 36.4 | 129 |

## Usage

Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:

```python
# pip install transformers
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="open-r1/OlympicCoder-32B", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {"role": "user", "content": "Write a python program to calculate the 10th Fibonacci number"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=8000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|im_start|>user
# Write a python program to calculate the 10th fibonacci number<|im_end|>
# <|im_start|>assistant
# <think>Okay, I need to write a Python program that calculates the 10th Fibonacci number. Hmm, the Fibonacci sequence starts with 0 and 1. Each subsequent number is the sum of the two preceding ones. So the sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and so on. ...
```
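
As the sample output above shows, the model emits its reasoning inside a `<think>` block before the final answer. If you only want the final response, a minimal post-processing sketch (assuming the generation closes its reasoning with a `</think>` tag, which is not guaranteed for every sample) is:

```python
# Minimal sketch: keep only the text after the closing </think> tag.
# If no </think> marker is present, this falls back to the full generated text.
generated = outputs[0]["generated_text"]
final_answer = generated.split("</think>")[-1].strip()
print(final_answer)
```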
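Beyond the Transformers pipeline above, a common option for faster inference on long generations is to serve the checkpoint with vLLM. The sketch below is not from the original card: it assumes the model loads like other Qwen2.5-based checkpoints in vLLM, and `tensor_parallel_size` is an assumption you should adapt to your hardware.

```python
# Hedged sketch: serving OlympicCoder-32B with vLLM (not part of the original card).
from vllm import LLM, SamplingParams

llm = LLM(
    model="open-r1/OlympicCoder-32B",
    dtype="bfloat16",
    tensor_parallel_size=8,  # assumption: a 32B bf16 model typically needs several GPUs
)

sampling_params = SamplingParams(temperature=0.7, top_k=50, top_p=0.95, max_tokens=8000)
messages = [
    {"role": "user", "content": "Write a python program to calculate the 10th Fibonacci number"},
]

# LLM.chat applies the model's chat template before generating.
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```
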
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training on 16 H100 nodes (a hedged sketch of how they could map onto a TRL-style fine-tuning run follows the list):

- dataset: open-r1/codeforces-cots_decontaminated
- learning_rate: 4.0e-5
- train_batch_size: 1
- seed: 42
- packing: false
- distributed_type: fsdp
- num_devices: 128
- gradient_accumulation_steps: 1
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_min_lr
- min_lr_rate: 0.1
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 10.0
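
The values in the list are from the card; the wiring below (TRL's `SFTConfig`/`SFTTrainer`, the output directory, the dataset split, and `bf16`) is a hedged sketch of how such a run could be assembled, not the project's actual training script. The multi-node FSDP launch across 128 GPUs would be configured separately through the launcher (e.g. `accelerate launch`) and is omitted here.

```python
# A hedged sketch only: it maps the listed hyperparameters onto a TRL SFT run.
# The actual open-r1 training scripts, FSDP launch config (distributed_type: fsdp,
# num_devices: 128), and dataset preprocessing are not reproduced here.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed: the dataset exposes a standard "train" split with chat-formatted messages.
train_dataset = load_dataset("open-r1/codeforces-cots_decontaminated", split="train")

training_args = SFTConfig(
    output_dir="olympiccoder-32b-sft",  # assumed output path, not from the card
    learning_rate=4.0e-5,
    per_device_train_batch_size=1,      # train_batch_size in the list above
    gradient_accumulation_steps=1,
    num_train_epochs=10.0,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.1},
    warmup_ratio=0.03,
    packing=False,
    seed=42,
    bf16=True,                          # assumed precision for H100 training
    # Adam betas (0.9, 0.999) and epsilon 1e-8 match the transformers defaults.
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",  # base model from the card
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```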