TareksGraveyard/Thespian-Qwen2.5-72B

---
base_model:
- Sao10K/72B-Qwen2.5-Kunou-v1
- EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2
- zetasepic/Qwen2.5-72B-Instruct-abliterated
- spow12/ChatWaifu_72B_v2.2
- Steelskull/Q2.5-MS-Mistoria-72b-v2
library_name: transformers
tags:
- mergekit
- merge
license: other
license_name: qwen
---
After some success merging my favorite Llama 3 models, I decided to try my hand at some Qwen 2.5 models I have tried and enjoyed. I never quite got onto the Qwen bandwagon, as I have always preferred Llama, but a lot of folks swear by Qwen. In my limited experience with Qwen I have enjoyed these models, and I think I merged something decent. For this merge I went for the DELLA method with aggressive parameters.

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the DELLA merge method, with [zetasepic/Qwen2.5-72B-Instruct-abliterated](https://huggingface.co/zetasepic/Qwen2.5-72B-Instruct-abliterated) as the base.

### Models Merged

The following models were included in the merge:
* [Sao10K/72B-Qwen2.5-Kunou-v1](https://huggingface.co/Sao10K/72B-Qwen2.5-Kunou-v1)
* [EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2)
* [spow12/ChatWaifu_72B_v2.2](https://huggingface.co/spow12/ChatWaifu_72B_v2.2)
* [Steelskull/Q2.5-MS-Mistoria-72b-v2](https://huggingface.co/Steelskull/Q2.5-MS-Mistoria-72b-v2)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Sao10K/72B-Qwen2.5-Kunou-v1
    parameters:
      weight: 0.25
  - model: Steelskull/Q2.5-MS-Mistoria-72b-v2
    parameters:
      weight: 0.25
  - model: EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2
    parameters:
      weight: 0.25
  - model: spow12/ChatWaifu_72B_v2.2
    parameters:
      weight: 0.25
merge_method: della
base_model: zetasepic/Qwen2.5-72B-Instruct-abliterated
parameters:
  density: 0.7
  epsilon: 0.2
  lambda: 1.1
  window_size: 0.14
  rescale: 1
dtype: bfloat16
tokenizer_source: base
```
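
To give a rough feel for what `weight`, `density`, `epsilon` and `lambda` do in the configuration above, here is a toy sketch of a DELLA-style merge on plain Python lists. It is not mergekit's implementation. The function names (`keep_probabilities`, `prune`, `della_merge`) are made up for illustration, and the reading of `epsilon` as a spread of keep probabilities around `density` is a simplifying assumption. `window_size` and `rescale` are not modeled, and no weight renormalization is done after sign election.

```python
import random


def keep_probabilities(deltas, density, epsilon):
    """Give larger-magnitude deltas a higher chance of surviving.

    Assumption: keep probabilities are spread linearly by magnitude rank
    over [density - epsilon, density + epsilon], so their mean is `density`.
    """
    low, high = density - epsilon, density + epsilon
    if not (0.0 < low and high <= 1.0):
        raise ValueError("density +/- epsilon must stay within (0, 1]")
    n = len(deltas)
    if n == 1:
        return [density]
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(n), key=lambda i: abs(deltas[i]))
    probs = [0.0] * n
    for rank, i in enumerate(order):
        probs[i] = low + (high - low) * rank / (n - 1)
    return probs


def prune(deltas, density, epsilon, rng):
    """Randomly drop deltas, rescaling each survivor by 1 / keep probability."""
    probs = keep_probabilities(deltas, density, epsilon)
    return [d / p if rng.random() < p else 0.0 for d, p in zip(deltas, probs)]


def della_merge(base, models, weights, density, epsilon, lam, seed=0):
    """Toy merge: prune each model's delta from base, elect a sign, then fuse."""
    rng = random.Random(seed)
    pruned = []
    for model in models:
        deltas = [m - b for m, b in zip(model, base)]  # task vector vs. base
        pruned.append(prune(deltas, density, epsilon, rng))

    merged = []
    for j, b in enumerate(base):
        contribs = [w * p[j] for w, p in zip(weights, pruned)]
        # Sign election: keep only contributions that agree with the net sign.
        sign = 1.0 if sum(contribs) >= 0 else -1.0
        agreed = sum(c for c in contribs if c * sign > 0)
        # lambda scales the fused delta before it is added back to the base.
        merged.append(b + lam * agreed)
    return merged


if __name__ == "__main__":
    # Hypothetical 4-parameter "models", mirroring four donors at weight 0.25.
    base = [0.0, 0.0, 0.0, 0.0]
    donors = [
        [0.20, -0.10, 0.05, 0.40],
        [0.10, 0.30, -0.02, 0.50],
        [0.15, -0.20, 0.01, 0.45],
        [0.05, 0.10, 0.03, 0.35],
    ]
    print(della_merge(base, donors, [0.25] * 4, density=0.7, epsilon=0.2, lam=1.1))
```

Under this reading, `density: 0.7` with `epsilon: 0.2` gives keep probabilities between 0.5 and 0.9, the four weights of 0.25 sum to 1, and `lambda: 1.1` scales the fused delta up by 10% before it is added back to the base.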