Update documentation of OpenAI compatible server configuration (#1141)
Fixed incorrect setup for extra parameters in OpenAI compatible server configuration (see PR #1032)
README.md (CHANGED)
@@ -273,10 +273,12 @@ If `endpoints` are left unspecified, ChatUI will look for the model on the hosted
 
 ##### OpenAI API compatible models
 
-Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), and [ialacol](https://github.com/chenhunghan/ialacol).
+Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [ialacol](https://github.com/chenhunghan/ialacol), and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).
 
 The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). `endpoint.baseURL` is the URL of the OpenAI API compatible server; it overrides the base URL used by the OpenAI instance. `endpoint.completion` determines which endpoint to use: the default is `chat_completions`, which uses `v1/chat/completions`; set `endpoint.completion` to `completions` to use the `v1/completions` endpoint.
 
+Parameters not supported by OpenAI (e.g. `top_k`, `repetition_penalty`) must be set in the `extraBody` of `endpoints`. Be aware that setting them in `parameters` will cause them to be omitted.
+
 ```
 MODELS=`[
   {
@@ -285,15 +287,17 @@ MODELS=`[
     "parameters": {
       "temperature": 0.9,
       "top_p": 0.95,
-      "repetition_penalty": 1.2,
-      "top_k": 50,
-      "truncate": 1000,
       "max_new_tokens": 1024,
       "stop": []
     },
     "endpoints": [{
       "type" : "openai",
-      "baseURL": "http://localhost:8000/v1"
+      "baseURL": "http://localhost:8000/v1",
+      "extraBody": {
+        "repetition_penalty": 1.2,
+        "top_k": 50,
+        "truncate": 1000
+      }
     }]
   }
 ]`
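For reference, applying both hunks yields a model entry like the one below. This is a reconstruction from the hunks alone: the top-level fields collapsed between the two hunks are not visible in this diff, so the `"name"` value here is a placeholder, not the verbatim README content.

```
MODELS=`[
  {
    "name": "text-generation-webui",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://localhost:8000/v1",
      "extraBody": {
        "repetition_penalty": 1.2,
        "top_k": 50,
        "truncate": 1000
      }
    }]
  }
]`
```

With this layout, OpenAI-standard sampling parameters stay in `parameters`, while server-specific ones (`repetition_penalty`, `top_k`, `truncate`) ride along in `extraBody` instead of being silently dropped.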
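For a server that only exposes the legacy `v1/completions` route, the documentation above says to switch the endpoint via `endpoint.completion`. A minimal sketch of that endpoint block, assuming the same local server as the example (this variant is described in the prose but not shown in the diff):

```
"endpoints": [{
  "type": "openai",
  "baseURL": "http://localhost:8000/v1",
  "completion": "completions"
}]
```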