Add model card
This PR adds a model card for the paper [Qwen2.5-1M Technical Report](https://huggingface.co/papers/2501.15383).
It also adds the license, library name, pipeline tag, and a link to the GitHub repository.
Please review and merge this PR if everything looks good.
README.md (CHANGED)
````diff
@@ -15,6 +15,11 @@ library_name: transformers
     <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 
+This repository contains the model from the paper [Qwen2.5-1M Technical Report](https://huggingface.co/papers/2501.15383).
+
+Project page: https://qwenlm.github.io/blog/qwen2.5-1m/.
+Code: https://github.com/QwenLM/Qwen2.5
+
 ## Introduction
 
 Qwen2.5-1M is the long-context version of the Qwen2.5 series models, supporting a context length of up to 1M tokens. Compared to the Qwen2.5 128K version, Qwen2.5-1M demonstrates significantly improved performance in handling long-context tasks while maintaining its capability in short tasks.
@@ -170,7 +175,7 @@ for output in outputs:
 vllm serve Qwen/Qwen2.5-14B-Instruct-1M \
   --tensor-parallel-size 4 \
   --max-model-len 1010000 \
-  --enable-chunked-prefill --max-num-batched-tokens 131072 \
+  --enable-chunked-prefill --max-num-batched_tokens 131072 \
   --enforce-eager \
   --max-num-seqs 1
 
@@ -233,4 +238,4 @@ If you find our work helpful, feel free to give us a cite.
 journal={arXiv preprint arXiv:2501.15383},
 year={2025}
 }
-```
+```
````
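For context on the first hunk: the card's metadata declares `library_name: transformers`, so the model loads with the standard Qwen chat-template pattern. The snippet below is a minimal sketch for reviewers, not part of the diff; the repo id matches the serve command in the second hunk, and the prompt and generation settings are placeholders.

```python
# Sketch: loading the model with transformers, as implied by
# `library_name: transformers` in the card metadata. The prompt and
# generation settings are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-14B-Instruct-1M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens before decoding the completion.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```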
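The `vllm serve` hunk is the one substantive code change. Its flags trade throughput for the memory headroom a 1,010,000-token context needs: with chunked prefill enabled, `--max-num-batched_tokens 131072` caps how many tokens are prefilled per scheduler step, `--enforce-eager` disables CUDA graph capture to save memory, and `--max-num-seqs 1` keeps only one sequence in flight. To sanity-check the command, the server can be queried through vLLM's OpenAI-compatible endpoint. A minimal sketch, assuming vLLM's default `localhost:8000` and a placeholder prompt:

```python
# Sketch: querying the server started by the `vllm serve` command above.
# Assumes vLLM's default OpenAI-compatible endpoint on localhost:8000;
# the API key is unused by a default vLLM server but required by the client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-14B-Instruct-1M",
    messages=[{"role": "user", "content": "Hello! What model are you?"}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```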