Add pipeline tag and GitHub link

#5 opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +7 -5
README.md CHANGED
@@ -1,20 +1,22 @@
  ---
- thumbnail: https://cdn-uploads.huggingface.co/production/uploads/633e85093a17ab61de8d9073/e41oJOWxEvrZYiXstfEH_.png
+ library_name: transformers
  license: other
  license_name: tongyi-qianwen
- library_name: transformers
+ thumbnail: https://cdn-uploads.huggingface.co/production/uploads/633e85093a17ab61de8d9073/e41oJOWxEvcZYiXstfEH_.png
+ pipeline_tag: text-generation
  ---

- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/633e85093a17ab61de8d9073/e41oJOWxEvrZYiXstfEH_.png)
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/633e85093a17ab61de8d9073/e41oJOWxEvcZYiXstfEH_.png)

  - Try out the model on [![Featherless](https://img.shields.io/badge/featherless--ai%2FQRWKV--72B-Dummy?style=flat&label=Featherless&color=facc15)](https://featherless.ai/models/featherless-ai/QRWKV-72B)
  - Model details from our blog post here! [![Substack](https://img.shields.io/badge/Substack-Dummy?style=flat&color=facc15)](https://substack.recursal.ai/p/qwerky-72b-and-32b-training-large)
  - This model was presented in [RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale](https://huggingface.co/papers/2505.03005).
+ - Code: [https://github.com/recursal/RADLADS](https://github.com/recursal/RADLADS)

  Benchmarks are as follows for both the QRWKV-QwQ-32B and QRWKV-72B models:

  | Tasks | Metric | QRWKV-QwQ-32B | Qwen/QwQ-32B | QRWKV-72B | Qwen2.5-72B-Instruct |
- |:---:|:---:|:---:|:---:|:---:|:---:|
+ |:---:|:---:|:---:|:---:|:---:|:---:|\
  | arc_challenge | acc_norm | **0.5640** | 0.5563 | **0.6382** | 0.6323 |
  | arc_easy | acc_norm | 0.7837 | **0.7866** | **0.8443** | 0.8329 |
  | hellaswag | acc_norm | 0.8303 | **0.8407** | 0.8573 | **0.8736** |

@@ -83,4 +85,4 @@ As demonstrated with our QRWKV-72B-Preview and prior models such as QRWKV6-32B I

  As with our previous models, the model's inherent knowledge and dataset training are inherited from its "parent" model. Consequently, unlike previous RWKV models trained on 100+ languages, the QRWKV model is limited to approximately 30 languages supported by the Qwen line of models.

- You may find details of the process in our previous release, [here](https://huggingface.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1).
+ You may find details of the process in our previous release, [here](https://huggingface.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1).
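For context on the `pipeline_tag: text-generation` entry this PR adds: that tag is what routes the model to the standard transformers text-generation pipeline on the Hub. Below is a minimal usage sketch, assuming the checkpoint loads through transformers as declared by `library_name`, and that the repository ID matches the Featherless badge (`featherless-ai/QRWKV-72B`); both are assumptions, not confirmed by this PR.

```python
# Minimal sketch: load the model via the pipeline implied by the new
# `pipeline_tag: text-generation` metadata. The repo ID is an assumption
# taken from the Featherless badge; replace it with the actual model ID.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="featherless-ai/QRWKV-72B",  # assumed repo ID
    trust_remote_code=True,  # hybrid RWKV/Qwen architectures may ship custom modeling code
    device_map="auto",       # a 72B checkpoint needs multiple GPUs or offloading
)

out = generator("Explain linear attention in one sentence:", max_new_tokens=64)
print(out[0]["generated_text"])
```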