vllm-inference / runner.sh

Commit History

148829b  feat(runner.sh): DeepSeek-R1-Distill-Qwen-32B d66bcfc2f3fd52799f95943264f32ba15ca0003d (yusufs)
1530e6e  feat(runner.sh): --trust-remote-code (yusufs)
57f9fa5  feat(runner.sh): add deepseek-ai/DeepSeek-R1 and deepseek-ai/DeepSeek-V3 (yusufs)
c0cde8e  feat(runner.sh): only enable prefix caching and disable log request (yusufs)
8c5a84b  feat(runner.sh): --enable-chunked-prefill and --enable-prefix-caching for faster generate (yusufs)
5bd7bc7  fix(runner.sh): enable eager mode (disabling cuda graph) (yusufs)
cb15911  fix(runner.sh): --enforce-eager not support values (yusufs)
266e7dd  fix(runner.sh): explicitly disabling enforce_eager (yusufs)
6bb48e9  fix(runner.sh): disable eager-loading so it using cuda graph (in order for parallel and faster processing) (yusufs)
dc19c1d  feat(runner.sh): add specific task and code revision (yusufs)
490e6a3  feat(runner.sh): using MODEL_ID only (yusufs)
69c6372  feat(runner.sh): using runner.sh to select llm in the run time (yusufs)
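Taken together, the history describes a runner.sh that selects the model at run time via a `MODEL_ID` variable and launches vLLM with the flags named in the commit messages (`--trust-remote-code`, `--enable-prefix-caching`, `--disable-log-requests`, and the boolean `--enforce-eager`, which takes no value). A minimal sketch of such a launcher, assuming the vLLM OpenAI-compatible server entrypoint; the `build_cmd` helper, the default model, and the default port are illustrative, not taken from the actual script:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of runner.sh. build_cmd assembles the vLLM server
# invocation from environment variables; the flags mirror those named in
# the commit history above.
MODEL_ID="${MODEL_ID:-deepseek-ai/DeepSeek-R1-Distill-Qwen-32B}"
PORT="${PORT:-8000}"

build_cmd() {
  # --trust-remote-code:     required for models that ship custom code
  # --enable-prefix-caching: reuse KV cache across shared prompt prefixes
  # --disable-log-requests:  per the "disable log request" commit
  # --enforce-eager:         boolean flag (no value); disables CUDA graphs
  printf '%s ' \
    python -m vllm.entrypoints.openai.api_server \
    --model "$MODEL_ID" \
    --port "$PORT" \
    --trust-remote-code \
    --enable-prefix-caching \
    --disable-log-requests \
    --enforce-eager
  printf '\n'
}

# Print the assembled command rather than executing it, so the sketch is
# safe to run without a GPU or vLLM installed; the real script would
# exec the command instead.
build_cmd
```

Note that an earlier commit also added `--enable-chunked-prefill`, but a later one ("only enable prefix caching and disable log request") suggests it was dropped again, so it is omitted here.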