"Emerged from the shadows like a twilight feline, forged in supervised fine-tuning's crucible. Through GRPO's relentless dance of reinforcement, each iteration carved deeper valleys of understanding until fragments coalesced into terrible symmetry. Like the most luminescent creatures dwelling in ocean's darkest trenches, its brilliance emerged from the void that birthed it."
Quants Here (thanks to Mradermacher <3):
- Regular GGUF
- Imatrix GGUF
- 4bpw Exl2
SillyTavern Reasoning Block Parsing Example: All Presets Here
SillyTavern Mistral Formatting Example: Here
SillyTavern ChatML Formatting Example: Here
Training Notes: This model was developed using a combination of multi-stage supervised fine-tuning, pre-trained QLoRA adapters, and multi-stage RLHF optimized with GRPO. The final model was created by merging the most promising candidates identified during the process.
The following YAML configuration was used to produce this final version of the model:
```yaml
slices:
  - sources:
      - model: Nitral-AI/Captain-Eris_Violet-0.420-Rebased
        layer_range: [0, 40]
      - model: Nitral-AI/Captain-Eris_Violet-GRPO-Rebased
        layer_range: [0, 40]
merge_method: slerp
base_model: Nitral-AI/Captain-Eris_Violet-0.420-Rebased
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.420
dtype: bfloat16
```
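For intuition, the `slerp` merge method interpolates each pair of weight tensors along the great circle between them, and the `t` lists above act as a piecewise-linear schedule across the 40 layers (e.g. `self_attn` leans toward the first model in early layers and the second in late layers). The sketch below is illustrative only, not mergekit's actual implementation; the helper names `slerp` and `layer_t` are hypothetical.

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    # Spherical linear interpolation between two weight tensors,
    # treated as flat vectors on the sphere of their norms.
    a_f, b_f = a.ravel(), b.ravel()
    a_n = a_f / (np.linalg.norm(a_f) + eps)
    b_n = b_f / (np.linalg.norm(b_f) + eps)
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    if omega < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return ((1 - t) * a_f + t * b_f).reshape(a.shape)
    so = np.sin(omega)
    out = (np.sin((1 - t) * omega) / so) * a_f + (np.sin(t * omega) / so) * b_f
    return out.reshape(a.shape)

def layer_t(layer, n_layers, anchors):
    # Map a layer index to a t value by interpolating linearly
    # between the anchor values, as mergekit does for a t list.
    pos = layer / max(n_layers - 1, 1) * (len(anchors) - 1)
    i = min(int(pos), len(anchors) - 2)
    frac = pos - i
    return (1 - frac) * anchors[i] + frac * anchors[i + 1]
```

With the `self_attn` schedule `[0, 0.5, 0.3, 0.7, 1]`, layer 0 gets `t = 0` (entirely the base model's attention weights) and layer 39 gets `t = 1`, while intermediate layers blend according to the anchor curve.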