license: apache-2.0
datasets:
- EleutherAI/pile
RWKV-7 trained on the Pile with the "20b tokenizer" (332115325534 tokens)
0.1B = L12-D768, lr 8e-4 to 3e-5 cosine decay, wd 0.1, bsz 8x30x4096
0.4B = L24-D1024, lr 6e-4 to 2e-5 cosine decay, wd 0.1, bsz 8x30x4096
1.5B = L24-D2048, lr 5e-4 to 1.5e-5 cosine decay, wd 0.1, bsz 8x45x4096
See https://github.com/BlinkDL/RWKV-LM for details.
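
To make the numbers above concrete, here is a rough sketch of a cosine decay schedule for the 0.1B settings. It assumes the three factors in "bsz 8x30x4096" multiply out to tokens per optimizer step (the card does not say what each factor means) and uses the textbook cosine form with no warmup; the actual schedule in RWKV-LM may differ. The helper names `tokens_per_step` and `cosine_lr` are made up for illustration.

```python
import math

def tokens_per_step(bsz):
    """Multiply out a batch spec such as (8, 30, 4096).
    Assumption: the product is the number of tokens per optimizer step."""
    return math.prod(bsz)

def cosine_lr(step, total_steps, lr_start, lr_end):
    """Textbook cosine decay from lr_start (step 0) to lr_end (last step).
    Assumption: no warmup; the real training script may add one."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_end + 0.5 * (lr_start - lr_end) * (1.0 + math.cos(math.pi * progress))

TOTAL_TOKENS = 332_115_325_534   # token count stated in this card
small_bsz = (8, 30, 4096)        # 0.1B and 0.4B settings
steps = math.ceil(TOTAL_TOKENS / tokens_per_step(small_bsz))

# Learning rate at the start, midpoint, and end of training for the 0.1B model.
for frac in (0.0, 0.5, 1.0):
    step = int(frac * steps)
    print(f"{frac:.1f} of training: lr = {cosine_lr(step, steps, 8e-4, 3e-5):.3e}")
```

Swapping in `(8, 45, 4096)` with the 1.5B learning rates gives the corresponding estimate for the largest model.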
How to run it:
https://pypi.org/project/rwkv/
or
https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7
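
As a starting point for the pip route, a minimal sketch of text generation with the `rwkv` package might look like the following. It is based on the package's published examples, not tested against these checkpoints: the wrapper `generate`, the checkpoint path, the tokenizer file location, and the sampling values are all placeholders, and which strategy strings work for RWKV-7 may depend on the installed version.

```python
import os

# Set before importing rwkv.model, as in the package's examples.
os.environ["RWKV_V7_ON"] = "1"    # enable RWKV-7 support
os.environ["RWKV_JIT_ON"] = "1"
os.environ["RWKV_CUDA_ON"] = "0"  # do not compile the custom CUDA kernel

def generate(model_path, tokenizer_path, prompt,
             token_count=100, strategy="cpu fp32"):
    """Hypothetical wrapper: load a checkpoint and sample a continuation.

    model_path: local checkpoint path (the package's examples omit ".pth").
    tokenizer_path: local copy of 20B_tokenizer.json (assumed to be
        downloaded separately, e.g. from the RWKV-LM repository).
    """
    # Imported lazily so the environment variables above take effect first.
    from rwkv.model import RWKV
    from rwkv.utils import PIPELINE, PIPELINE_ARGS

    model = RWKV(model=model_path, strategy=strategy)
    pipeline = PIPELINE(model, tokenizer_path)
    args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)  # illustrative values
    return pipeline.generate(prompt, token_count=token_count, args=args)
```

A call would then look like `generate("/path/to/checkpoint", "20B_tokenizer.json", "The Pile is")`, with both paths replaced by real local files.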