Install the CUDA toolkit:
Nvidia Maxwell or higher:
https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64
Nvidia Kepler or higher:
https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Windows&target_arch=x86_64
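The choice between the two downloads can be sketched as a small lookup. This is only an illustration: the function name `cuda_download_url` is hypothetical, and it assumes you already know your GPU's compute capability major version (Kepler is 3.x, Maxwell is 5.x). It simply mirrors the two cases as listed here; the architectures supported by the newest toolkit change over time, so check NVIDIA's release notes if in doubt.

```python
# Hypothetical helper: map a GPU's compute capability (major version)
# to one of the two download pages listed in this README.
LATEST_URL = "https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64"
CUDA_11_8_URL = "https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Windows&target_arch=x86_64"


def cuda_download_url(compute_major: int) -> str:
    """Return the toolkit download page for a compute capability major version."""
    if compute_major >= 5:  # Maxwell (5.x) or newer
        return LATEST_URL
    if compute_major >= 3:  # Kepler (3.x)
        return CUDA_11_8_URL
    raise ValueError("GPUs older than Kepler are not covered by these instructions")


if __name__ == "__main__":
    print(cuda_download_url(8))  # e.g. an Ampere card
    print(cuda_download_url(3))  # a Kepler card
```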
I haven't done much testing, but Visual Studio with the "Desktop development with C++" workload might be required. I've gotten cl.exe errors on a previous install.
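To see whether the CUDA compiler and the MSVC compiler are visible before running the setup script, a quick check along these lines may help. It is a minimal sketch, not part of this repo: `find_build_tools` is a hypothetical name, and it only looks at the current PATH. Note that cl.exe is usually on PATH only inside a Visual Studio Developer Command Prompt, so "not found" in a regular terminal does not necessarily mean Visual Studio is missing.

```python
import shutil


def find_build_tools(tools=("nvcc", "cl")):
    """Return a dict mapping each tool name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    for name, path in find_build_tools().items():
        status = path if path else "NOT FOUND on PATH"
        print(f"{name}: {status}")
```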
Make sure you set up the environment by running windows-setup.bat.
After everything is done, download a model using download-model.bat.
To quantize, use convert-model-auto.bat. Enter the model's folder name, then the BPW (bits per weight) for the model.
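When picking a BPW value, it can help to estimate how large the quantized weights will be. The sketch below uses the plain arithmetic parameters × bits ÷ 8 and is only a rough guide: it ignores file overhead and any layers stored at a different precision, and `estimate_size_gb` is a hypothetical helper, not something shipped with these scripts.

```python
def estimate_size_gb(num_params: float, bpw: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) at a given bits per weight."""
    return num_params * bpw / 8 / 1e9


if __name__ == "__main__":
    # Example: a model with roughly 7 billion parameters at a few BPW settings.
    for bpw in (3.0, 4.0, 6.0):
        print(f"{bpw} bpw: about {estimate_size_gb(7e9, bpw):.2f} GB")
```

Compare the estimate against your available VRAM, leaving room for the context cache.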
You can always pause the quantization process by pressing Ctrl + C and typing exit. All progress will be stored in the WD (working directory) folder. You can resume where you left off by running the convert-model-auto.bat script with the same arguments you used before.
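The general idea behind this kind of pause and resume can be sketched as follows. This is a generic illustration of checkpointing into a working directory, not exllamav2's actual implementation; `run_resumable` and the `progress.json` file name are made up for the example.

```python
import json
from pathlib import Path


def run_resumable(steps, workdir, state_name="progress.json"):
    """Run callables in order, recording the number of finished steps in workdir.

    Returns how many steps were skipped because an earlier run completed them.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    state_file = workdir / state_name

    # Pick up the saved position if a previous run left one behind.
    done = json.loads(state_file.read_text())["done"] if state_file.exists() else 0

    for index in range(done, len(steps)):
        steps[index]()  # do one unit of work
        # Checkpoint after every step so an interruption loses at most one step.
        state_file.write_text(json.dumps({"done": index + 1}))
    return done
```

Calling it again with the same steps and the same working directory continues from the last checkpoint, which is why the same arguments matter when resuming.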
Credit to turboderp for creating exllamav2 and the exl2 quantization method.
https://github.com/turboderp
Credit to oobabooga for the original download script.
https://github.com/oobabooga