Create README.md
Browse files
README.md
ADDED
@@ -0,0 +1,10 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
base_model:
|
4 |
+
- Qwen/QwQ-32B
|
5 |
+
---
|
6 |
+
|
7 |
+
Example run:
|
8 |
+
```bash
|
9 |
+
docker run --rm --runtime nvidia --gpus 'all' --ipc=host -e VLLM_WORKER_MULTIPROC_METHOD=spawn -e 'HF_TOKEN' -v '/data/hf_cache:/root/.cache/huggingface' -v '/data/llmcompressor/output/QwQ-32B-FP8-Dynamic:/model' -p 127.0.0.1:8000:8000 "vllm/vllm-openai:v0.7.3" --model 'ig1/QwQ-32B-FP8-Dynamic' --served-model-name 'QwQ-32B' --enable-reasoning --reasoning-parser deepseek_r1 --override-generation-config '{"temperature":0.6,"top_p":0.95}'
|
10 |
+
```
|