|
--- |
|
base_model:
- meta-llama/Llama-3.1-8B-Instruct
- arcee-ai/Llama-3.1-SuperNova-Lite
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- FuseAI/FuseChat-Llama-3.1-8B-Instruct
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
|
|
--- |
|
# Llama3.1-SuperDeepFuse |
|
|
|
An 8B-parameter language model created by merging three high-performance distilled models to improve reasoning, instruction following, and performance on mathematics and coding tasks.
|
|
|
## Model Highlights |
|
|
|
- **Size**: 8 billion parameters |
|
- **Base**: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
|
- **Merged Sources**:

  - [arcee-ai/Llama-3.1-**Super**Nova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)

  - [deepseek-ai/**Deep**Seek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

  - [FuseAI/**Fuse**Chat-Llama-3.1-8B-Instruct](https://huggingface.co/FuseAI/FuseChat-Llama-3.1-8B-Instruct)
|
- **Merge Method**: `model_stock` (via [mergekit](https://github.com/arcee-ai/mergekit)); a sample configuration is sketched below
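
The exact mergekit configuration was not published with this card, but a `model_stock` merge of these sources is typically specified along these lines (the `dtype` choice here is an assumption):

```yaml
# Hypothetical mergekit configuration for a model_stock merge.
# model_stock averages the source models around the shared Instruct base.
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  - model: FuseAI/FuseChat-Llama-3.1-8B-Instruct
merge_method: model_stock
base_model: meta-llama/Llama-3.1-8B-Instruct
dtype: bfloat16
```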
|
|
|
## Key Capabilities |
|
|
|
- Enhanced multi-task reasoning |
|
- Improved mathematical and coding performance |
|
- Multilingual support |
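
## Usage

A minimal `transformers` sketch; the repo id below is a placeholder for wherever this merge is hosted:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Llama3.1-SuperDeepFuse"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)

# Llama 3.1 Instruct models expect chat-formatted prompts.
messages = [
    {"role": "user", "content": "Write a Python function that tests whether a number is prime."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```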
|
|
|
## Performance Notes |
|
|
|
- Intended to preserve the safety behavior of the Llama 3.1 Instruct base

- Small enough for consumer GPU deployment: roughly 16 GB of weights in BF16, far less with 4-bit quantization (see the sketch below)

- Balanced performance across diverse tasks
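
For GPUs with less memory, 4-bit loading through bitsandbytes is one option (repo id again a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize weights to 4-bit at load time; cuts memory from ~16 GB
# in BF16 to roughly 5-6 GB including overhead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "your-org/Llama3.1-SuperDeepFuse",  # placeholder repo id
    quantization_config=bnb_config,
    device_map="auto",
)
```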
|
|
|
## Considerations |
|
|
|
- Formal benchmark results are not yet available

- Capabilities are limited compared to larger Llama 3.1 variants (70B, 405B)

- Like all language models, it can produce inaccurate or misleading output

- Outputs should be independently verified
|
|
|
## Licensing |
|
|
|
Use of this model is governed by the Llama 3.1 Community License, inherited from the Llama 3.1 base and source models.