free-solar-0.3-exl2
Original model: free-solar-slerp-v0.3
Model creator: freewheelin
Quants
4bpw h6 (main)
4.25bpw h6
4.65bpw h6
5bpw h6
6bpw h6
8bpw h8
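As a rough guide for picking a quant (my own back-of-the-envelope estimate, not a figure from the model card), the weight footprint is approximately the parameter count times bits per weight:

```python
def quant_size_gb(n_params: float, bpw: float) -> float:
    """Approximate weight size in GB for a given bits-per-weight (bpw)."""
    return n_params * bpw / 8 / 1e9

# Rough weight sizes for an ~11B-parameter model at the listed quant levels.
# Actual VRAM use is higher: this excludes the KV cache and activation overhead.
for bpw in (4.0, 4.25, 4.65, 5.0, 6.0, 8.0):
    print(f"{bpw:.2f} bpw ≈ {quant_size_gb(11e9, bpw):.1f} GB")
```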
Quantization notes
Quantized with Exllamav2 0.0.15 using the default calibration dataset.
This model has unusually long loading times: a typical 11B model loads in about 30 s on my PC, but this one takes around 130 s.
I'm not sure why; it may be because the original weights were stored in FP32 instead of the usual FP16.
Overall VRAM usage and generation speed seem normal, though.
This seems to be primarily a Korean language model.
I didn't realize it at first when I tried it, since the language wasn't explicitly listed.
I'm unable to evaluate it in its main area, but it seems usable in English and, to some degree, in other languages.
When used in English, it sometimes switches topics at random or starts writing in Korean, as if it occasionally forgets to emit the stop token.
But it has an interesting writing style in English and overall seems quite reasonable, so I decided to make a full set of quants.
I was also curious to try quantizing an FP32 model: RAM requirements were higher, but the process went smoothly without any issues.
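The higher RAM use is expected: FP32 source weights take roughly twice the memory of FP16. A quick sanity check for an ~11B-parameter model (my own estimate, raw weight storage only):

```python
def weights_gb(n_params: float, bytes_per_param: int) -> float:
    # Raw weight storage only; excludes quantization working buffers.
    return n_params * bytes_per_param / 1e9

fp16 = weights_gb(11e9, 2)  # FP16: 2 bytes per parameter
fp32 = weights_gb(11e9, 4)  # FP32: 4 bytes per parameter
print(f"FP16: {fp16:.0f} GB, FP32: {fp32:.0f} GB")
```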
How to run
This quantization format runs on GPU and requires the Exllamav2 loader, which is available in the following applications:
Original model card
free-solar-0.3
This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the SLERP merge method.
Models Merged
The following models were included in the merge: