L3-8B-Lunar-Stheno-GGUF

This is a quantized version of HiroseKoichi/L3-8B-Lunar-Stheno, created using llama.cpp.
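As a sketch of how a GGUF quantization like this is typically produced with llama.cpp: the Hugging Face checkpoint is first converted to a full-precision GGUF file, then quantized to the desired bit width. The paths and the Q4_K_M quant type below are illustrative, not the exact commands used for this repo.

```shell
# Get llama.cpp and its conversion-script dependencies
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp
pip install -r requirements.txt

# Convert the original HF checkpoint to a 16-bit GGUF file
python convert_hf_to_gguf.py /path/to/L3-8B-Lunar-Stheno \
    --outfile L3-8B-Lunar-Stheno-F16.gguf --outtype f16

# Build the quantization tool, then produce a 4-bit variant
cmake -B build && cmake --build build --target llama-quantize
./build/bin/llama-quantize L3-8B-Lunar-Stheno-F16.gguf \
    L3-8B-Lunar-Stheno.Q4_K_M.gguf Q4_K_M
```

The same `llama-quantize` invocation with a different type argument (e.g. `Q2_K`, `Q8_0`) yields the other bit widths listed below.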

Model Description

L3-8B-Lunaris-v1 is definitely a significant improvement over L3-8B-Stheno-v3.2 in terms of situational awareness and prose, but it's not without issues: the response length can sometimes be very long, causing it to go on a rant; it tends to not take direct action, saying that it will do something but never actually doing it; and its performance outside of roleplay took a hit.

This merge fixes all of those issues, and I'm genuinely impressed with the results. While I did use a SLERP merge to create this model, there was no blending of the models; all I did was replace L3-8B-Stheno-v3.2's weights with L3-8B-Lunaris-v1's.
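The "no blending" behavior follows from how SLERP behaves at its endpoints: with the interpolation factor `t` pinned to 0 or 1 for a parameter group (as in the config below), the merge simply copies that group's weights from one model or the other. A minimal numpy sketch of SLERP (not mergekit's exact implementation) makes this concrete:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors."""
    a, b = v0.ravel(), v1.ravel()
    # Cosine of the angle between the flattened tensors
    cos = np.dot(a, b) / max(np.linalg.norm(a) * np.linalg.norm(b), eps)
    cos = np.clip(cos, -1.0, 1.0)
    theta = np.arccos(cos)
    if theta < eps:  # nearly parallel: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

w_stheno = np.array([1.0, 0.0])   # toy stand-ins for two models' weights
w_lunaris = np.array([0.0, 1.0])

# t=0 returns the base (Stheno) weights unchanged; t=1 returns Lunaris outright
print(slerp(0.0, w_stheno, w_lunaris))  # → [1. 0.]
print(slerp(1.0, w_stheno, w_lunaris))  # → [0. 1.]
```

Only a fractional `t` (e.g. 0.5) would actually blend the two tensors; the config here uses exclusively 0 and 1.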

Details

Models Used

- Sao10K/L3-8B-Stheno-v3.2
- Sao10K/L3-8B-Lunaris-v1

Merge Config

```yaml
models:
  - model: Sao10K/L3-8B-Stheno-v3.2
  - model: Sao10K/L3-8B-Lunaris-v1
merge_method: slerp
base_model: Sao10K/L3-8B-Stheno-v3.2
parameters:
  t:
    - filter: self_attn
      value: 0
    - filter: mlp
      value: 1
    - value: 0
dtype: bfloat16
```
GGUF Details

Model size: 8.03B params
Architecture: llama
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
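Any of the quantized files can be run directly with llama.cpp's CLI. A sketch, with the filename and flags illustrative rather than copied from this repo:

```shell
# Download one quant from the repo (filename illustrative), then chat with it
huggingface-cli download QuantFactory/L3-8B-Lunar-Stheno-GGUF \
    L3-8B-Lunar-Stheno.Q4_K_M.gguf --local-dir .

# -cnv enables interactive conversation mode; -c sets the context length
./llama-cli -m L3-8B-Lunar-Stheno.Q4_K_M.gguf \
    -p "You are a helpful assistant." -cnv -c 8192
```

Lower-bit quants (2-bit, 3-bit) trade output quality for memory; higher-bit quants (6-bit, 8-bit) stay closer to the original bfloat16 weights.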

