This model is derived from Llama-2-7b-hf by pruning with LLM-Streamline ("Streamlining Redundant Layers to Compress Large Language Models", ICLR 2025 Spotlight). The entire training process required only 0.06B tokens.
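The pruned model can be loaded like any other causal LM checkpoint on the Hub. A minimal sketch, assuming the repo id from this card (`XiaodongChen/Llama-2-4.7B`) and illustrative generation settings (the card does not specify recommended ones):

```python
# Sketch: loading the pruned checkpoint with Hugging Face transformers.
# MODEL_ID comes from this model card; dtype matches the BF16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "XiaodongChen/Llama-2-4.7B"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a short continuation of `prompt` (assumed settings)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # checkpoint is stored in BF16
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The capital of France is"))
```

Downloading the ~4.7B-parameter weights happens on the first `from_pretrained` call; on CPU-only machines, drop `device_map="auto"` and expect slow generation.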

Below are the results of the evaluation using lm-eval (accuracy, %):

| Model | arc_c | arc_e | boolq | hellaswag | openbookqa | rte | winogrande | Avg |
|---|---|---|---|---|---|---|---|---|
| Llama-2-7B | 43.3 | 76.4 | 77.7 | 57.2 | 31.4 | 62.8 | 69.1 | 59.7 |
| Llama-2-4.7B | 34.0 | 64.6 | 74.7 | 49.8 | 27.4 | 61.7 | 66.4 | 54.1 |
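The table above can be reproduced with EleutherAI's lm-evaluation-harness. A sketch of the invocation, assuming default harness settings for each task (the card does not state the exact flags, few-shot counts, or batch size used):

```shell
# Sketch: evaluating the pruned model on the seven tasks from the table.
# Task names follow lm-eval conventions; batch size is an assumption.
lm_eval --model hf \
  --model_args pretrained=XiaodongChen/Llama-2-4.7B,dtype=bfloat16 \
  --tasks arc_challenge,arc_easy,boolq,hellaswag,openbookqa,rte,winogrande \
  --batch_size 8
```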
Model size: 4.71B parameters (BF16, Safetensors).

Model tree for XiaodongChen/Llama-2-4.7B: derived from Llama-2-7b-hf; 2 quantized variants are available.