base_model: []
library_name: transformers
tags:
- mergekit
- merge
Stellar Odyssey 12b v0.0
Join my dream, it's just the right time, whoa... Leave it all behind... Get ready now... Riise up into my world~
Listen to the song on youtube: https://www.youtube.com/watch?v=npyiiInMA0w
This is my second attempt at a model merge, This time, these models were used
- mistralai/Mistral-Nemo-Base-2407
- Sao10K/MN-12B-Lyra-v4
- nothingiisreal/MN-12B-Starcannon-v2
- Gryphe/Pantheon-RP-1.5-12b-Nemo
License for this model is: Apache 2.0 (due to the base model, Mistral Nemo Base 2407)
Intended Use case: Roleplay
Instruction Format: ChatML
Thank you to AuriAetherwiing for helping me merge the models.
Data?
This is a hard question to answer, I didn't add any data to the model itself, rather it's a merge of other models, so the data used for them applies to this model too, though it won't be the same.
Merge Details
Merge Method
This model was merged using the della_linear merge method using mistralai/Mistral-Nemo-Base-2407 as a base.
Models Merged
The following models were included in the merge:
- Sao10K/MN-12B-Lyra-v4
- Gryphe/Pantheon-RP-1.5-12b-Nemo
- nothingiisreal/MN-12B-Starcannon-v2
Configuration
The following YAML configuration was used to produce this model:
models:
- model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\Sao10K_MN-12B-Lyra-v4
parameters:
weight: 0.3
density: 0.25
- model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\nothingiisreal_MN-12B-Starcannon-v2
parameters:
weight: 0.1
density: 0.4
- model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\Gryphe_Pantheon-RP-1.5-12b-Nemo
parameters:
weight: 0.4
density: 0.5
merge_method: della_linear
base_model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\mistralai_Mistral-Nemo-Base-2407
parameters:
epsilon: 0.05
lambda: 1
merge_method: della_linear
dtype: bfloat16
Notes
Della_Linear: Refer to https://arxiv.org/abs/2406.11617 and https://arxiv.org/abs/2212.04089, as it is quite long to explain what Della_Linear is BFloat16: Brain Floating Point 16, a way to run models faster on Nvidia GPUs Density: Fraction of weights in differences from the base model to retain Epsilon: Maximum change in drop probability based on magnitude. Drop probabilities assigned will range