nielsr (HF Staff) committed 7d505a5 (verified) · 1 Parent(s): 93e0fc8

Add pipeline tag, link to paper, add library name


This PR adds the relevant `pipeline_tag` to the model card, a link to the paper, and the `library_name`.

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -1,15 +1,16 @@
 ---
-license: apache-2.0
 datasets:
 - PrimeIntellect/Intellect-2-RL-Dataset
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # INTELLECT-2
 
 INTELLECT-2 is a 32 billion parameter language model trained through a reinforcement learning run leveraging globally distributed, permissionless GPU resources contributed by the community.
 
-The model was trained using [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl), a framework designed for distributed asynchronous RL, using GRPO over verifiable rewards along with modifications for improved training stability. For detailed information on our infrastructure and training recipe, see our [technical report](https://www.primeintellect.ai/intellect-2).
-
+The model was trained using [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl), a framework designed for distributed asynchronous RL, using GRPO over verifiable rewards along with modifications for improved training stability. For detailed information on our infrastructure and training recipe, see our [technical report](https://www.primeintellect.ai/intellect-2) and paper [INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning](https://huggingface.co/papers/2505.07291).
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a32edf17b9f57eaec2ea65/KxI7k7byQs4ATme0naIzV.png)
 
@@ -23,7 +24,7 @@ The model was trained using [prime-rl](https://github.com/PrimeIntellect-ai/prim
 
 INTELLECT-2 is based on the `qwen2` architecture, making it compatible with popular libraries and inference engines such as [vllm](https://github.com/vllm-project/vllm) or [sglang](https://github.com/sgl-project/sglang).
 
-Given that INTELLECT-2 was trained with a length control budget, you will achieve the best results by appending the prompt `"Think for 10000 tokens before giving a response."` to your instruction. As reported in our technical report, the model did not train for long enough to fully learn the length control objective, which is why results won't differ strongly if you specify lengths other than 10,000. If you wish to do so, you can expect the best results with 2000, 4000, 6000 and 8000, as these were the other target lengths present during training.
+Given that INTELLECT-2 was trained with a length control budget, you will achieve the best results by appending the prompt `\"Think for 10000 tokens before giving a response.\"` to your instruction. As reported in our technical report, the model did not train for long enough to fully learn the length control objective, which is why results won't differ strongly if you specify lengths other than 10,000. If you wish to do so, you can expect the best results with 2000, 4000, 6000 and 8000, as these were the other target lengths present during training.
 
 ## Performance
 
@@ -38,8 +39,6 @@ During training, INTELLECT-2 improved upon QwQ in its mathematical and coding ab
 | Qwen-R1-Distill-32B | 69.9 | 58.4 | 55.1 | 65.2 | 72.0 |
 | Deepseek-R1 | 78.6 | 65.1 | 64.1 | 71.6 | 82.7 |
 
-
-
 ## Citation
 
 Feel free to cite INTELLECT-2:
@@ -54,4 +53,4 @@ Feel free to cite INTELLECT-2:
       primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.07291},
 }
-```
+```
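
For context on the card being edited here, below is a minimal usage sketch of the workflow the README describes: loading the model with `transformers` (the `library_name` this PR adds) and appending the recommended length-control suffix to the instruction. The Hub id `PrimeIntellect/INTELLECT-2`, the example question, and the generation settings are assumptions for illustration, not taken from the commit.

```python
# Minimal sketch (not part of this commit): query INTELLECT-2 via transformers,
# appending the length-control suffix the model card recommends.
# Assumed: Hub id "PrimeIntellect/INTELLECT-2" and the sampling budget below.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrimeIntellect/INTELLECT-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

question = "How many positive integers below 100 are divisible by 3 or 5?"
# The card reports the best results when the instruction ends with this suffix.
prompt = question + " Think for 10000 tokens before giving a response."

input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Leave headroom above the 10000-token thinking budget for the final answer.
output_ids = model.generate(input_ids, max_new_tokens=12000)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The card also notes compatibility with vllm and sglang; serving through either engine would follow their usual OpenAI-compatible workflows, with the same suffix appended to user messages.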