Update README.md
README.md
CHANGED
@@ -43,6 +43,50 @@ This is an **early preview** of our 7B parameter pure RNN-based model, trained o

- 🧮 Math-specific improvements
- 📚 RL enhanced reasoning model

# Inference on an AMD Radeon GPU with llama.cpp

```bash
git clone https://github.com/MollySophia/llama.cpp.git -b rwkv-v7
cd llama.cpp
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build --config Release -- -j 16
cd ./build/bin
```
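
A quick sanity check that the build produced working binaries (a minimal sketch, assuming the standard llama.cpp `--version` flag):

```bash
# Print the llama.cpp build number and commit; run from ./build/bin.
./llama-cli --version
```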

### Convert the safetensors model to GGUF

```bash
python ./convert_hf_to_gguf.py [model_dir]
```
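
As a sketch, assuming the checkpoint was downloaded to a hypothetical local directory `./rwkv7-7B`, a half-precision GGUF can be produced with the script's `--outfile` and `--outtype` options:

```bash
# Convert the safetensors checkpoint in ./rwkv7-7B (hypothetical path) into a
# single f16 GGUF file.
python ./convert_hf_to_gguf.py ./rwkv7-7B --outfile ./rwkv7-7B-f16.gguf --outtype f16
```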

### Model quantization

```bash
./llama-quantize [gguf_model_path] [quantization_type]
```
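
Continuing the sketch, the hypothetical f16 file from the conversion step can be quantized down to 4-bit `Q4_K_M`:

```bash
# Quantize the f16 GGUF (hypothetical paths) to Q4_K_M, writing a new,
# smaller GGUF alongside the input.
./llama-quantize ./rwkv7-7B-f16.gguf ./rwkv7-7B-Q4_K_M.gguf Q4_K_M
```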

### Run the model in the web UI with llama-server

```bash
./llama-server -m [gguf_model_path] -t [cpu_thread_count] -ngl 99 --host [host] --port [port]
```
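
A concrete invocation, reusing the hypothetical quantized file from the previous step (thread count, host, and port are arbitrary choices):

```bash
# Serve the quantized model with all layers offloaded to the GPU (-ngl 99)
# and 8 CPU threads; the web UI is then reachable at http://127.0.0.1:8080.
./llama-server -m ./rwkv7-7B-Q4_K_M.gguf -t 8 -ngl 99 --host 127.0.0.1 --port 8080
```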

Radeon 7000 series GPUs use the `gfx1100` target and Radeon 6000 series use `gfx1030`; set `-DAMDGPU_TARGETS` in the build command above accordingly.
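
For example, on a Radeon 7000 series card the configure step above would become:

```bash
# Same HIP build as before, with the architecture target switched to gfx1100
# for Radeon 7000 series GPUs.
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build --config Release -- -j 16
```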

# Inference on an Nvidia GPU with llama.cpp

```bash
git clone https://github.com/MollySophia/llama.cpp.git -b rwkv-v7
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
cd ./build/bin
```
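
As in the AMD section, the compile step can be parallelized, e.g. with 16 jobs:

```bash
# Optional: build with 16 parallel jobs instead of the default.
cmake --build build --config Release -j 16
```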

### Convert the safetensors model to GGUF

```bash
python ./convert_hf_to_gguf.py [model_dir]
```

### Model quantization

```bash
./llama-quantize [gguf_model_path] [quantization_type]
```

### Run the model in the web UI with llama-server

```bash
./llama-server -m [gguf_model_path] -t [cpu_thread_count] -ngl 99 --host [host] --port [port]
```

## How to use

```bash