Mention v0.4 model ; Add Open LLM Leaderboard scores

d391d28 verified about 1 year ago

4.96 kB

	---
	license: apache-2.0
	language:
	- en
	tags:
	- merge
	base_model:
	- mistralai/Mistral-7B-Instruct-v0.2
	- ehartford/dolphin-2.2.1-mistral-7b
	- SciPhi/SciPhi-Mistral-7B-32k
	- ehartford/samantha-1.2-mistral-7b
	- Arc53/docsgpt-7b-mistral
	- HuggingFaceH4/zephyr-7b-beta
	- meta-math/MetaMath-Mistral-7B
	- Open-Orca/Mistral-7B-OpenOrca
	- openchat/openchat-3.5-1210
	- beowolx/MistralHermes-CodePro-7B-v1
	- TIGER-Lab/MAmmoTH-7B-Mistral
	- teknium/OpenHermes-2.5-Mistral-7B
	- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
	- mlabonne/NeuralHermes-2.5-Mistral-7B
	---

	# Update 2024-01-03

	Check out our [v0.4 model](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.4) which is based on this and achieves better average score of 71.19 versus 69.66.

	# Model Description

	This is an update to [EmbeddedLLM/Mistral-7B-Merge-14-v0.2](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.2) that removes
	potentially TruthfulQA-contaminated models and non-commercially licensed models:
	1. [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
	2. [Q-bert/MetaMath-Cybertron-Starling](https://huggingface.co/Q-bert/MetaMath-Cybertron-Starling)
	3. [v1olet/v1olet_marcoroni-go-bruins-merge-7B](https://huggingface.co/v1olet/v1olet_marcoroni-go-bruins-merge-7B)


	This is an experiment to test merging 14 models using DARE TIES 🦙

	The result is a base model that performs quite well but may need some further chat fine-tuning.

	The 14 models are as follows:
	1. [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
	2. [ehartford/dolphin-2.2.1-mistral-7b](https://huggingface.co/ehartford/dolphin-2.2.1-mistral-7b)
	3. [SciPhi/SciPhi-Mistral-7B-32k](https://huggingface.co/SciPhi/SciPhi-Mistral-7B-32k)
	4. [ehartford/samantha-1.2-mistral-7b](https://huggingface.co/ehartford/samantha-1.2-mistral-7b)
	5. [Arc53/docsgpt-7b-mistral](https://huggingface.co/Arc53/docsgpt-7b-mistral)
	6. [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
	7. [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)
	8. [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
	9. [openchat/openchat-3.5-1210](https://huggingface.co/openchat/openchat-3.5-1210)
	10. [beowolx/MistralHermes-CodePro-7B-v1](https://huggingface.co/beowolx/MistralHermes-CodePro-7B-v1)
	11. [TIGER-Lab/MAmmoTH-7B-Mistral](https://huggingface.co/TIGER-Lab/MAmmoTH-7B-Mistral)
	12. [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
	13. [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp)
	14. [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)

	- base model: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

	## Open LLM Leaderboard

	\| \| v0.3 \| v0.4 \|
	\|------------\|-------\|-------\|
	\| Average \| 69.66 \| 71.19 \|
	\| ARC \| 65.96 \| 66.81 \|
	\| HellaSwag \| 85.29 \| 86.15 \|
	\| MMLU \| 64.35 \| 65.10 \|
	\| TruthfulQA \| 57.80 \| 58.25 \|
	\| Winogrande \| 78.30 \| 80.03 \|
	\| GSM8K \| 66.26 \| 70.81 \|

	## Chat Template

	We tried ChatML and Llama-2 chat template, but feel free to try other templates.

	## Merge Configuration

	The merge config file for this model is here:

	```yaml
	models:
	- model: mistralai/Mistral-7B-v0.1
	# no parameters necessary for base model
	- model: ehartford/dolphin-2.2.1-mistral-7b
	parameters:
	weight: 0.08
	density: 0.4
	- model: SciPhi/SciPhi-Mistral-7B-32k
	parameters:
	weight: 0.08
	density: 0.4
	- model: ehartford/samantha-1.2-mistral-7b
	parameters:
	weight: 0.08
	density: 0.4
	- model: Arc53/docsgpt-7b-mistral
	parameters:
	weight: 0.08
	density: 0.4
	- model: HuggingFaceH4/zephyr-7b-beta
	parameters:
	weight: 0.08
	density: 0.4
	- model: meta-math/MetaMath-Mistral-7B
	parameters:
	weight: 0.08
	density: 0.4
	- model: Open-Orca/Mistral-7B-OpenOrca
	parameters:
	weight: 0.08
	density: 0.4
	- model: openchat/openchat-3.5-1210
	parameters:
	weight: 0.08
	density: 0.4
	- model: beowolx/MistralHermes-CodePro-7B-v1
	parameters:
	weight: 0.08
	density: 0.4
	- model: TIGER-Lab/MAmmoTH-7B-Mistral
	parameters:
	weight: 0.08
	density: 0.4
	- model: teknium/OpenHermes-2.5-Mistral-7B
	parameters:
	weight: 0.08
	density: 0.4
	- model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
	parameters:
	weight: 0.08
	density: 0.4
	- model: mlabonne/NeuralHermes-2.5-Mistral-7B
	parameters:
	weight: 0.08
	density: 0.4
	- model: mistralai/Mistral-7B-Instruct-v0.2
	parameters:
	weight: 0.08
	density: 0.5
	merge_method: dare_ties
	base_model: mistralai/Mistral-7B-v0.1
	parameters:
	int8_mask: true
	dtype: bfloat16

	```