---
license: apache-2.0
datasets:
- EleutherAI/pile
---
RWKV-7 models trained on the Pile with the "20b tokenizer" (332115325534 tokens).

- 0.1B = L12-D768, lr 8e-4 to 3e-5 cosine decay, wd 0.1, bsz 8x30x4096
- 0.4B = L24-D1024, lr 6e-4 to 2e-5 cosine decay, wd 0.1, bsz 8x30x4096
- 1.5B = L24-D2048, lr 5e-4 to 1.5e-5 cosine decay, wd 0.1, bsz 8x45x4096

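
To make the numbers above concrete, here is a rough sketch of the implied schedule. It assumes that `bsz AxBxC` multiplies out to tokens per optimizer step, and that "cosine decay" means the usual half-cosine interpolation from the initial to the final learning rate over the whole run. Both are assumptions, not details confirmed here, and the helper names are made up for illustration.

```python
import math

TOTAL_TOKENS = 332_115_325_534  # token count stated above

# name -> (lr_init, lr_final, batch-size factors), copied from the list above
CONFIGS = {
    "0.1B": (8e-4, 3e-5, (8, 30, 4096)),
    "0.4B": (6e-4, 2e-5, (8, 30, 4096)),
    "1.5B": (5e-4, 1.5e-5, (8, 45, 4096)),
}


def tokens_per_step(bsz):
    """Assumption: the product of the bsz factors is the tokens seen per step."""
    return math.prod(bsz)


def cosine_lr(progress, lr_init, lr_final):
    """Half-cosine interpolation; progress is the fraction of training done (0..1)."""
    progress = min(max(progress, 0.0), 1.0)
    return lr_final + 0.5 * (lr_init - lr_final) * (1.0 + math.cos(math.pi * progress))


for name, (lr_init, lr_final, bsz) in CONFIGS.items():
    per_step = tokens_per_step(bsz)
    steps = TOTAL_TOKENS // per_step  # full steps under the assumption above
    mid_lr = cosine_lr(0.5, lr_init, lr_final)
    print(f"{name}: {per_step} tokens/step, ~{steps} steps, lr at halfway {mid_lr:.3e}")
```

Under these assumptions the 0.1B and 0.4B runs take about 1.5x as many steps as the 1.5B run, since the 1.5B batch is 1.5x larger.
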
Check https://github.com/BlinkDL/RWKV-LM for details.
How to run it:
https://pypi.org/project/rwkv/
or
https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7
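
A minimal sketch of the pip route, assuming the `rwkv` package is installed (`pip install rwkv`) along with PyTorch, and that you have downloaded a checkpoint and the `20B_tokenizer.json` file locally. The paths, the `generate` helper, and the sampling values are placeholders chosen for illustration; the strategy string may need adjusting for your hardware (for example `cuda fp16`).

```python
import os

# These flags are read when rwkv.model is imported, so set them first.
os.environ["RWKV_V7_ON"] = "1"    # enable RWKV-7 support in the rwkv package
os.environ["RWKV_JIT_ON"] = "1"
os.environ["RWKV_CUDA_ON"] = "0"  # "1" would compile the custom CUDA kernel


def generate(model_path, tokenizer_path, prompt, token_count=100):
    """Hypothetical helper: load a checkpoint and sample a continuation.

    model_path: checkpoint path (placeholder; the package README passes it
                without the .pth extension).
    tokenizer_path: path to a local 20B_tokenizer.json (placeholder).
    """
    # Imported lazily so the environment flags above take effect.
    from rwkv.model import RWKV
    from rwkv.utils import PIPELINE, PIPELINE_ARGS

    model = RWKV(model=model_path, strategy="cpu fp32")
    pipeline = PIPELINE(model, tokenizer_path)
    args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
    return pipeline.generate(prompt, token_count=token_count, args=args)
```

A call would look like `generate("/path/to/checkpoint", "/path/to/20B_tokenizer.json", "The Pile is")`, with both paths replaced by your own.
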