feihu.hf committed · Commit 6b8ee6a · Parent(s): 332255d

update README.md

README.md CHANGED
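The diff below adds vLLM install instructions and tells the reader to merge a snippet into the model's `config.json`. That configuration step amounts to a top-level JSON merge, which can be sketched in Python; the helper name and the `example_setting` key here are placeholders for illustration, not the actual YARN snippet from the README.

```python
import json
import os
import tempfile

def merge_into_config(path: str, snippet: dict) -> dict:
    """Merge snippet keys into the JSON file at `path` and write it back."""
    with open(path) as f:
        cfg = json.load(f)
    cfg.update(snippet)  # top-level merge; existing keys are overwritten
    with open(path, "w") as f:
        json.dump(cfg, f, indent=2)
    return cfg

# Demonstrate on a throwaway file; "example_setting" is a placeholder,
# not the real long-context settings described in the README.
tmp = os.path.join(tempfile.mkdtemp(), "config.json")
with open(tmp, "w") as f:
    json.dump({"model_type": "qwen2"}, f)
cfg = merge_into_config(tmp, {"example_setting": True})
```

In practice you would pass the exact snippet given in the README in place of the placeholder dict.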
@@ -77,7 +77,13 @@ To handle extensive inputs exceeding 32,768 tokens, we utilize [YARN](https://ar
 
 For deployment, we recommend using vLLM. You can enable the long-context capabilities by following these steps:
 
-1. **Install vLLM**:
+1. **Install vLLM**: You can install vLLM by running the following command.
+
+```bash
+pip install "vllm>=0.4.3"
+```
+
+Or you can install vLLM from [source](https://github.com/vllm-project/vllm/).
 
 2. **Configure Model Settings**: After downloading the model weights, modify the `config.json` file by including the below snippet:
 ```json