Update README.md
README.md CHANGED
```diff
@@ -29,11 +29,10 @@ Created by: [upstage](https://huggingface.co/upstage)
 Honestly, I have no idea how YARN models are supposed to work out of the box.
 Llama.cpp should have proper YARN support, but I haven't seen good enough information about real-life use of this feature.
 Text-Generation-WebUI doesn't set loading params for YARN models automatically, even though the params are defined in the model's config.json.
-Maybe some other apps do it properly but you might need to do it yourself as the model
+Maybe some other apps do it properly, but you might need to set the params yourself, as the model seems to output gibberish if they aren't set correctly.
 The model supposedly has 8k context with 16x RoPE scaling.
 I tried loading it with 8x scaling for a potential 64k context, and it did seem to output text properly, but 8x is the max I could set in Text-Generation-WebUI.
-Didn't test it thoroughly with different scaling and context lengths, can't promise anything
+Didn't test it thoroughly with different scaling and context lengths, so I can't promise anything.
-As a side note, the GGUF conversion process is so blazingly fast compared to exl2, I'm impressed...
 
 ## How to run
 
```
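Since the change says you may need to set the scaling params yourself, a minimal sketch of doing that with llama.cpp's CLI might look like the following. The flag names are llama.cpp's own (`--rope-scaling`, `--yarn-orig-ctx`, `--ctx-size`); the 8192 base context and 8x factor are the README's figures, and the model filename is a placeholder:

```shell
# Hypothetical invocation: load the GGUF with YaRN scaling set explicitly
# instead of relying on the app to read it from config.json.
# Target context = 8192 (base) * 8 (scale factor) = 65536.
./llama-cli -m ./model-q4_k_m.gguf \
  --rope-scaling yarn \
  --yarn-orig-ctx 8192 \
  --ctx-size 65536 \
  -p "Hello"
```

If the output turns to gibberish at long contexts, the scaling params are the first thing to re-check.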