license: apache-2.0
Scalability test from small to large
ToDo for Me:
Qwen 2.5 14B
Qwen 2.5 7B
Qwen 2.5 3B
Phi-4 14B
Phi-4-mini 3.8B
Gemma 3 12B
Gemma 3 4B
Architecture: RWKV cxa076 (based on RWKV x070)
Currently supported only in RWKV-Infer.
```
curl http://127.0.0.1:9000/loadmodel -X POST \
  -H "Content-Type: application/json" \
  -d '{"model_filename":"models/PRWKV7-cxa076-qwen3b-stage2final-ctx2048.pth","model_viewname":"PRWKV7-cxa076 Qwen 2.5 3B Stage2 FP8","model_strategy":"fp8","template":"qwen","endtoken":"<|im_end|>","default_temperature":"1.0","default_top_p":"0.3"}'
```
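For scripting, the same request can be sent from Python. This is a minimal sketch using only the standard library; the URL and every payload field are copied from the curl command above. The helper names `build_loadmodel_request` and `load_model` and the 600-second timeout are hypothetical choices for illustration, not part of RWKV-Infer.

```python
import json
import urllib.request

# Endpoint taken from the curl command above; adjust host/port to your server.
LOADMODEL_URL = "http://127.0.0.1:9000/loadmodel"


def build_loadmodel_request(url: str = LOADMODEL_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks the server to load the model.

    Hypothetical helper: field names and values mirror the curl payload exactly.
    """
    payload = {
        "model_filename": "models/PRWKV7-cxa076-qwen3b-stage2final-ctx2048.pth",
        "model_viewname": "PRWKV7-cxa076 Qwen 2.5 3B Stage2 FP8",
        "model_strategy": "fp8",
        "template": "qwen",
        "endtoken": "<|im_end|>",
        # The curl example passes these as strings, so they stay strings here.
        "default_temperature": "1.0",
        "default_top_p": "0.3",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def load_model(timeout: float = 600.0) -> str:
    """Send the request and return the raw response body as text.

    The timeout is an assumption; loading a checkpoint may take a while.
    """
    with urllib.request.urlopen(build_loadmodel_request(), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, calling `load_model()` returns whatever the endpoint responds with; the response format is not described here, so the sketch does not parse it.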