---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
- arcee-ai/Llama-3.1-SuperNova-Lite
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- FuseAI/FuseChat-Llama-3.1-8B-Instruct
library_name: transformers
tags:
- mergekit
- merge

---
# Llama3.1-SuperDeepFuse

An 8B-parameter language model that merges three high-performance distilled models to strengthen reasoning, instruction following, and mathematical and coding ability.

## Model Highlights

- **Size**: 8 billion parameters
- **Base**: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- **Merged Sources**:
  - [arcee-ai/Llama-3.1-**Super**Nova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)
  - [deepseek-ai/**Deep**Seek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
  - [FuseAI/**Fuse**Chat-Llama-3.1-8B-Instruct](https://huggingface.co/FuseAI/FuseChat-Llama-3.1-8B-Instruct)
- **Merge Method**: `model_stock` via [mergekit](https://github.com/arcee-ai/mergekit) (an example configuration is sketched below)
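
The exact merge recipe was not published with this card, but `model_stock` merges are typically produced with mergekit. A minimal sketch of a configuration consistent with the models listed above; the `dtype` choice is an assumption:

```yaml
# Hypothetical mergekit config: model_stock merge of the three
# source models onto the Llama-3.1-8B-Instruct base.
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  - model: FuseAI/FuseChat-Llama-3.1-8B-Instruct
merge_method: model_stock
base_model: meta-llama/Llama-3.1-8B-Instruct
dtype: bfloat16  # assumed; not stated in the card
```

A config like this would be run with `mergekit-yaml config.yaml ./merged`.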

## Key Capabilities

- Enhanced multi-task reasoning
- Improved mathematical and coding performance
- Multilingual support

## Performance Notes

- Maintains Llama 3.1 safety standards
- Suitable for consumer GPU deployment (see the loading sketch below)
- Balanced performance across diverse tasks
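
A minimal loading sketch with `transformers`; the repo id below is a placeholder, so substitute the actual Hugging Face repository:

```python
# Minimal usage sketch; "your-namespace" is a placeholder, not the real repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/Llama3.1-SuperDeepFuse"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # an 8B model in bf16 fits in ~16 GB of VRAM
    device_map="auto",           # requires the `accelerate` package
)

# Llama 3.1 is instruction-tuned, so format prompts through the chat template.
messages = [{"role": "user", "content": "What is 17 * 24? Show your steps."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```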

## Considerations

- Benchmark results are still pending
- Capabilities are limited compared to larger model variants
- Like all language models, it can produce plausible but misleading output
- Outputs should be independently verified

## Licensing

Use of this model is governed by the standard Llama 3.1 license terms.