nielsr (HF staff) committed
Commit 7640c29 · verified · 1 Parent(s): d5c878c

Add model card

This PR adds a model card for the paper [Qwen2.5-1M Technical Report](https://huggingface.co/papers/2501.15383).

It also adds the license, library name, pipeline tag, and a link to the GitHub repository.
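
For reference, this metadata lives in the README's YAML front matter. Below is a minimal sketch of the fields the PR touches; only `library_name: transformers` is visible in the diff, so the `license` and `pipeline_tag` values here are assumptions, not taken from the PR itself:

```yaml
# Hypothetical front-matter sketch; check the merged card for the actual values.
license: apache-2.0            # assumption: Qwen2.5-14B-Instruct-1M ships under Apache-2.0
library_name: transformers     # confirmed by the hunk header in the diff below
pipeline_tag: text-generation  # assumption: the usual tag for instruct/chat LLMs
```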

Please review and merge this PR if everything looks good.

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
````diff
@@ -15,6 +15,11 @@ library_name: transformers
 <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 
+This repository contains the model from the paper [Qwen2.5-1M Technical Report](https://huggingface.co/papers/2501.15383).
+
+Project page: https://qwenlm.github.io/blog/qwen2.5-1m/
+Code: https://github.com/QwenLM/Qwen2.5
+
 ## Introduction
 
 Qwen2.5-1M is the long-context version of the Qwen2.5 series models, supporting a context length of up to 1M tokens. Compared to the Qwen2.5 128K version, Qwen2.5-1M demonstrates significantly improved performance in handling long-context tasks while maintaining its capability in short tasks.
@@ -170,7 +175,7 @@ for output in outputs:
 vllm serve Qwen/Qwen2.5-14B-Instruct-1M \
   --tensor-parallel-size 4 \
   --max-model-len 1010000 \
-  --enable-chunked-prefill --max-num-batched-tokens 131072 \
+  --enable-chunked-prefill --max-num-batched_tokens 131072 \
   --enforce-eager \
   --max-num-seqs 1
 
@@ -233,4 +238,4 @@ If you find our work helpful, feel free to give us a cite.
 journal={arXiv preprint arXiv:2501.15383},
 year={2025}
 }
-```
+```
````
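
For context, the `vllm serve` command in the second hunk exposes an OpenAI-compatible HTTP endpoint. A minimal client sketch, assuming the server is running locally on vLLM's default port 8000 with no API key configured; the prompt is a placeholder:

```python
# Query the vLLM server started by the `vllm serve` command above.
# vLLM speaks the OpenAI chat-completions protocol, so the standard
# `openai` client works; only base_url needs to point at the server.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumption: default local vLLM endpoint
    api_key="EMPTY",                      # vLLM ignores the key unless one is configured
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-14B-Instruct-1M",
    messages=[{"role": "user", "content": "Give a one-paragraph summary of chunked prefill."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Note that with `--max-num-seqs 1` the server batches only one sequence at a time, so concurrent requests will queue rather than run in parallel.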