RWKV/RWKV7-Goose-Pile-168M-HF

SmerkyG committed on Jul 23

Commit cd3a3f2 · verified · 1 Parent(s): 3c3f5bb

Upload folder using huggingface_hub

Files changed (2)

README.md +8 -8
model.safetensors +2 -2

README.md CHANGED

@@ -1,15 +1,15 @@
 ---
-license: apache-2.0
+base_model:
+- BlinkDL/rwkv-7-pile
 datasets:
 - EleutherAI/the_pile_deduplicated
 language:
 - en
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- BlinkDL/rwkv-7-pile
 pipeline_tag: text-generation
-library_name: transformers
+library_name: rwkv
 ---
 # rwkv7-168M-pile
@@ -38,16 +38,16 @@ This is RWKV-7 model under flash-linear attention format.
 <!-- Provide the basic links for the model. -->
 - **Repository:** https://github.com/fla-org/flash-linear-attention ; https://github.com/BlinkDL/RWKV-LM
-- **Paper:** [RWKV-7 "Goose" with Expressive Dynamic State Evolution](https://arxiv.org/abs/2503.14456)
+- **Paper:** https://huggingface.co/papers/2503.14456
 - **Weights:** Converted from https://modelscope.cn/models/RWKV/rwkv-7-pile/file/view/master?fileName=RWKV-x070-Pile-168M-20241120-ctx4096.pth
 ## Uses
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-Install `flash-linear-attention` <= 0.1.2 and the latest version of `transformers` before using this model:
+Install `flash-linear-attention` and the latest version of `transformers` before using this model:
 ```bash
-pip install --no-use-pep517 flash-linear-attention==0.1.2
+pip install git+https://github.com/fla-org/flash-linear-attention
 pip install 'transformers>=4.48.0'
 ```
@@ -82,4 +82,4 @@ This model is trained on the Pile with a total of 332 billion tokens.
 ## FAQ
 Q: safetensors metadata is none.
-A: upgrade transformers to >=4.48.0: `pip install 'transformers>=4.48.0'`
+A: upgrade transformers to >=4.48.0: `pip install 'transformers>=4.48.0'`
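
The README asks for `transformers>=4.48.0` in both the install step and the FAQ. A minimal sketch of checking that requirement from Python before loading the model is shown below. The helper names (`parse_release`, `meets_minimum`, `check_transformers`) are hypothetical and not part of the repository, and the comparison is deliberately simplified: it looks only at the leading numeric components and ignores pre-release ordering.

```python
from importlib import metadata


def parse_release(version: str, width: int = 3) -> tuple[int, ...]:
    """Return the leading numeric components of a version string.

    Simplified on purpose: "4.50.0.dev0" -> (4, 50, 0). Pre-release and
    dev suffixes are dropped, so this is not a full PEP 440 comparison.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad short versions such as "4.48" so tuples compare component-wise.
    while len(parts) < width:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "4.48.0") -> bool:
    """True if `installed` is at least `minimum` (numeric parts only)."""
    return parse_release(installed) >= parse_release(minimum)


def check_transformers(minimum: str = "4.48.0") -> bool:
    """Check the locally installed transformers version, if any."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False  # not installed at all
    return meets_minimum(installed, minimum)


if __name__ == "__main__":
    print("transformers meets the README minimum:", check_transformers())
```

If the check reports `False`, the `pip install 'transformers>=4.48.0'` command from the README is the documented fix.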

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f7ba6afcdc9e797ca225f413cd0d2c2e7fab2012dd15d133ec6e169949829184
-size 670588632
+oid sha256:8fbeda2b50f0a09f6c98f4d263a542e2bb81d2beda353ee3c72d8c1576efd65a
+size 335318368
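
The diff for `model.safetensors` only changes a Git LFS pointer, which records the SHA-256 digest and byte size of the real file. A minimal sketch of checking a downloaded file against the new pointer follows. The function names and the local file path are assumptions made for illustration; only the pointer text is taken from the commit.

```python
import hashlib
from pathlib import Path

# Pointer contents for the new side of the diff (copied from the commit).
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8fbeda2b50f0a09f6c98f4d263a542e2bb81d2beda353ee3c72d8c1576efd65a
size 335318368
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    algo, _, expected = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    # Cheap check first: the size must match before hashing the whole file.
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Stream in chunks so large weight files are not read into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

With a local copy of the weights, `matches_pointer(Path("model.safetensors"), parse_lfs_pointer(NEW_POINTER))` would report whether the file is the one this commit uploaded (the path here is hypothetical). The new size, 335318368 bytes, is about half of the old 670588632 bytes. That would be consistent with the tensors being stored at half the bytes per parameter, though the commit message does not say so.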