---
base_model:
- NousResearch/Hermes-3-Llama-3.2-3B
- EpistemeAI/ReasoningCore-Llama-3B-R1-aligned
library_name: transformers
tags:
- mergekit
- merge
---
# merge1

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
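The merged model can be loaded like any other `transformers` causal language model. The snippet below is a minimal sketch; the repository id is a placeholder assumption, so substitute the actual Hub id or a local path to the merged weights.

```python
# Minimal inference sketch. "your-username/merge1" is a placeholder, not a published repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/merge1"  # assumption: replace with the real repo id or local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config below
    device_map="auto",           # requires `accelerate`
)

prompt = "Briefly explain chain-of-thought prompting."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```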
## Merge Details

### Merge Method
This model was merged using the Passthrough merge method, with a locally produced intermediate model named `merge` (the output of an earlier Step 1 merge, referenced as `base_model: merge` in the configuration below) serving as the base.
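Because Passthrough stacks the selected layer ranges rather than averaging them, the result is deeper than either donor: the three slices contribute 12 + 14 + 12 = 38 decoder layers, versus 28 in a stock Llama-3.2-3B. Assuming the merged weights were written to a local `./merge1` directory (a hypothetical path), this can be checked quickly:

```python
# Sketch: confirm the layer count of the stacked merge ("./merge1" is a hypothetical output path).
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("./merge1")
print(cfg.num_hidden_layers)  # expected: 38 (12 + 14 + 12 from the three slices)
```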
### Models Merged

The following models were included in the merge:

* [NousResearch/Hermes-3-Llama-3.2-3B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.2-3B)
* [EpistemeAI/ReasoningCore-Llama-3B-R1-aligned](https://huggingface.co/EpistemeAI/ReasoningCore-Llama-3B-R1-aligned)

### Configuration

The following YAML configuration was used to produce this model:
```yaml
merge_method: passthrough
dtype: bfloat16
base_model: merge
slices:
  - sources:
      - model: EpistemeAI/ReasoningCore-Llama-3B-R1-aligned
        layer_range: [0, 12]
        parameters:
          weight: [0.6, 0.65, 0.7]  # a list is interpolated as a gradient across the slice's layers
  - sources:
      - model: NousResearch/Hermes-3-Llama-3.2-3B
        layer_range: [8, 22]
        parameters:
          weight: 0.85  # constant weight for the Hermes-3 layers
  - sources:
      - model: merge  # reference to the Step 1 output directory/model
        layer_range: [16, 28]
        parameters:
          weight:
            - filter: self_attn  # weight applied to attention tensors in this slice
              value: 0.75
            - filter: mlp        # weight applied to MLP tensors in this slice
              value: 0.95
```
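A configuration like this is normally executed with mergekit, either through its `mergekit-yaml` CLI or its Python API. The sketch below assumes the YAML above has been saved to `config.yaml` and writes the merged model to `./merge1`; both paths, and the option values, are assumptions rather than the exact settings used.

```python
# Sketch of running the merge with mergekit's Python API (paths and options are assumptions).
# Roughly equivalent CLI: mergekit-yaml config.yaml ./merge1
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merge1",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # run the merge on GPU if one is available
        copy_tokenizer=True,             # place a tokenizer alongside the merged weights
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```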