littlebird13 commited on
Commit
7bc1816
·
verified ·
1 Parent(s): 8b02ae0

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +8 -7
README.md CHANGED
@@ -80,16 +80,17 @@ print("thinking content:", thinking_content)
80
  print("content:", content)
81
  ```
82
 
83
- For deployment, you can use `vllm>=0.8.5` or `sglang>=0.4.5.post2` to create an OpenAI-compatible API endpoint:
 
 
 
 
84
  - vLLM:
85
  ```shell
86
  vllm serve Qwen/Qwen3-8B-FP8 --enable-reasoning --reasoning-parser deepseek_r1
87
  ```
88
 
89
- - SGLang:
90
- ```shell
91
- python -m sglang.launch_server --model-path Qwen/Qwen3-8B-FP8 --reasoning-parser deepseek-r1
92
- ```
93
 
94
  ## Note on FP8
95
 
@@ -126,8 +127,8 @@ However, please pay attention to the following known issues:
126
  ## Switching Between Thinking and Non-Thinking Mode
127
 
128
  > [!TIP]
129
- > The `enable_thinking` switch is also available in APIs created by vLLM and SGLang.
130
- > Please refer to our documentation for [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) and [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) users.
131
 
132
  ### `enable_thinking=True`
133
 
 
80
  print("content:", content)
81
  ```
82
 
83
+ For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.4` or to create an OpenAI-compatible API endpoint:
84
+ - SGLang:
85
+ ```shell
86
+ python -m sglang.launch_server --model-path Qwen/Qwen3-8B-FP8 --reasoning-parser qwen3
87
+ ```
88
  - vLLM:
89
  ```shell
90
  vllm serve Qwen/Qwen3-8B-FP8 --enable-reasoning --reasoning-parser deepseek_r1
91
  ```
92
 
93
+ For local use, applications such as llama.cpp, Ollama, LMStudio, and MLX-LM have also supported Qwen3.
 
 
 
94
 
95
  ## Note on FP8
96
 
 
127
  ## Switching Between Thinking and Non-Thinking Mode
128
 
129
  > [!TIP]
130
+ > The `enable_thinking` switch is also available in APIs created by SGLang and vLLM.
131
+ > Please refer to our documentation for [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) and [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) users.
132
 
133
  ### `enable_thinking=True`
134