Add model card
This PR adds a model card for the paper [Qwen2.5-1M Technical Report](https://huggingface.co/papers/2501.15383).
It adds a link to the paper page, the license, a pipeline tag, and the code repository.
Please review and merge this PR if everything looks good.
README.md
CHANGED
```diff
@@ -1,13 +1,6 @@
 ---
 license: apache-2.0
-license_link: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-1M/blob/main/LICENSE
-language:
-- en
 pipeline_tag: text-generation
-base_model: Qwen/Qwen2.5-7B
-tags:
-- chat
-library_name: transformers
 ---
 
 # Qwen2.5-7B-Instruct-1M
@@ -15,6 +8,8 @@ library_name: transformers
 <img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
 </a>
 
+This repository contains the model of the paper [Qwen2.5-1M Technical Report](https://huggingface.co/papers/2501.15383).
+
 ## Introduction
 
 Qwen2.5-1M is the long-context version of the Qwen2.5 series models, supporting a context length of up to 1M tokens. Compared to the Qwen2.5 128K version, Qwen2.5-1M demonstrates significantly improved performance in handling long-context tasks while maintaining its capability in short tasks.
```
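Since the card describes a `text-generation` model served through `transformers`, a short quick-start could also be included. The sketch below is hypothetical and only assumes the standard `transformers` chat workflow; the model ID comes from the diff, while the prompt and generation settings are illustrative (the weights themselves are several GB, so this is not meant to run in CI):

```python
MODEL_ID = "Qwen/Qwen2.5-7B-Instruct-1M"  # repository this card describes


def build_messages(user_text: str) -> list[dict]:
    """Build a chat in the message format that transformers chat templates expect."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_text},
    ]


def main() -> None:
    # Imports kept local: they require `transformers` (and a torch backend)
    # plus a download of the model weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    # Render the chat into a single prompt string via the model's chat template.
    prompt = tokenizer.apply_chat_template(
        build_messages("Summarize this document."),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)

    # Decode only the newly generated tokens, not the echoed prompt.
    print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Whether such a snippet belongs in this PR is a reviewer's call; the upstream Qwen cards typically carry a similar usage section.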