	---
	license: apache-2.0
	base_model:
	- Trelis/Llama-3.2-1B-Instruct-MATH-3ep
	- huihui-ai/Llama-3.2-1B-Instruct-abliterated
	- passing2961/Ultron-Summarizer-1B
	- unsloth/Llama-3.2-1B-Instruct
	tags:
	- moe
	- frankenmoe
	- merge
	- mergekit
	- lazymergekit
	- Trelis/Llama-3.2-1B-Instruct-MATH-3ep
	- huihui-ai/Llama-3.2-1B-Instruct-abliterated
	- passing2961/Ultron-Summarizer-1B
	- unsloth/Llama-3.2-1B-Instruct
	---

	# DaRuukLLM-Refresh-4x1B-v1

	DaRuukLLM-Refresh-4x1B-v1 is a Mixture of Experts (MoE) model built from the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
	* [Trelis/Llama-3.2-1B-Instruct-MATH-3ep](https://huggingface.co/Trelis/Llama-3.2-1B-Instruct-MATH-3ep)
	* [huihui-ai/Llama-3.2-1B-Instruct-abliterated](https://huggingface.co/huihui-ai/Llama-3.2-1B-Instruct-abliterated)
	* [passing2961/Ultron-Summarizer-1B](https://huggingface.co/passing2961/Ultron-Summarizer-1B)
	* [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct)
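
The "4x1B" in the name does not mean four times the parameters of a single model. As a rough sketch only, the estimate below assumes that just the feed-forward (MLP) blocks are replicated per expert while attention, norms, and embeddings are shared, that input and output embeddings are tied, and that the layer shapes match the public Llama 3.2 1B config. None of these figures come from this repository, so treat the result as an approximation rather than the model's reported size.

```python
# Rough parameter estimate for a 4-expert MoE built from Llama-3.2-1B-sized models.
# All shape values are assumptions taken from the public Llama 3.2 1B config.
HIDDEN = 2048          # hidden size
INTERMEDIATE = 8192    # MLP intermediate size
LAYERS = 16            # transformer layers
VOCAB = 128256         # vocabulary size
KV_DIM = 512           # 8 key/value heads x 64 dims (grouped-query attention)
NUM_EXPERTS = 4


def attention_params():
    # q_proj and o_proj are HIDDEN x HIDDEN; k_proj and v_proj are HIDDEN x KV_DIM
    return 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM


def mlp_params():
    # gate_proj, up_proj, and down_proj of one feed-forward block
    return 3 * HIDDEN * INTERMEDIATE


def estimate(num_experts):
    # A router (one HIDDEN-sized vector per expert) exists only in the MoE case
    router = HIDDEN * num_experts if num_experts > 1 else 0
    norms = 2 * HIDDEN  # two RMSNorm weight vectors per layer
    per_layer = attention_params() + num_experts * mlp_params() + router + norms
    # Embeddings counted once (assumes tied input/output embeddings) plus the final norm
    return VOCAB * HIDDEN + LAYERS * per_layer + HIDDEN


dense = estimate(1)
moe = estimate(NUM_EXPERTS)
print(f"dense: {dense / 1e9:.2f}B, moe: {moe / 1e9:.2f}B")  # dense: 1.24B, moe: 3.65B
```

Under these assumptions the merged model is closer to 3.65B parameters than 4 x 1.24B, because only the MLP weights are duplicated.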

	## 🧩 Configuration

	```yaml
	base_model: unsloth/Llama-3.2-1B-Instruct # Base model for self-attention and layer normalization
	gate_mode: hidden # Use hidden state representations for MoE gate parameters
	dtype: bfloat16 # Output data type for the merged model

	experts:
	  - source_model: Trelis/Llama-3.2-1B-Instruct-MATH-3ep # Expert for math-related tasks
	    positive_prompts:
	      - "Solve the following math problem:"
	      - "Calculate the value of:"
	      - "What is the result of:"

	  - source_model: huihui-ai/Llama-3.2-1B-Instruct-abliterated # Expert for uncensored queries
	    positive_prompts:
	      - "Explain the following controversial topic:"
	      - "Discuss the implications of:"
	      - "Provide an uncensored analysis of:"

	  - source_model: passing2961/Ultron-Summarizer-1B # Expert for summarization tasks
	    positive_prompts:
	      - "Summarize the following text:"
	      - "Provide a concise summary of:"
	      - "Generate a brief overview of:"

	  - source_model: unsloth/Llama-3.2-1B-Instruct # Base model also acts as the chat expert
	    positive_prompts:
	      - "How can I assist you today?"
	      - "What would you like to discuss?"
	      - "Let's have a conversation about:"
	```
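
With `gate_mode: hidden`, each expert's gate parameters are derived from hidden-state representations of its `positive_prompts`. The toy sketch below illustrates one common way a router then uses such gate vectors at inference time: score the token's hidden state against every expert, keep the best few, and normalize their weights. The 3-dimensional vectors, the expert labels, the `route` helper, and the top-2 choice are all illustrative assumptions, not values or code taken from this model.

```python
import math

# Toy gate vectors, one per expert (hypothetical 3-dim values for illustration;
# a real router holds one hidden-size vector per expert in every layer).
GATES = {
    "math": [1.0, 0.0, 0.0],
    "uncensored": [0.0, 1.0, 0.0],
    "summarizer": [0.0, 0.0, 1.0],
    "chat": [0.5, 0.5, 0.5],
}


def route(hidden, gates, top_k=2):
    """Return the top_k experts and their normalized routing weights."""
    # Router logit for each expert is the dot product of hidden state and gate vector
    logits = {
        name: sum(h * g for h, g in zip(hidden, vec))
        for name, vec in gates.items()
    }
    # Keep the top_k highest-scoring experts
    chosen = sorted(logits, key=logits.get, reverse=True)[:top_k]
    # Softmax over the selected logits (shifted by the max for numerical stability)
    peak = max(logits[name] for name in chosen)
    exps = {name: math.exp(logits[name] - peak) for name in chosen}
    total = sum(exps.values())
    return {name: exps[name] / total for name in chosen}


# A hidden state that points mostly along the "math" direction
token_hidden = [2.0, 0.1, 0.3]
weights = route(token_hidden, GATES)
print(weights)
```

The token's output would then be the weighted sum of the selected experts' feed-forward outputs, so prompts resembling an expert's `positive_prompts` tend to be steered toward that expert.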

	## 💻 Usage

	```python
	# Install dependencies first (prefix the command with "!" in a notebook):
	#   pip install -qU transformers bitsandbytes accelerate

	import torch
	import transformers
	from transformers import AutoTokenizer, BitsAndBytesConfig

	model = "Xiaojian9992024/DaRuukLLM-Refresh-4x1B-v1"

	tokenizer = AutoTokenizer.from_pretrained(model)
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	    # Load the weights in 4-bit through bitsandbytes and compute in float16
	    model_kwargs={
	        "torch_dtype": torch.float16,
	        "quantization_config": BitsAndBytesConfig(load_in_4bit=True),
	    },
	)

	# Format the conversation with the model's chat template, then sample a reply
	messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
	prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
	print(outputs[0]["generated_text"])
	```
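
By default the text-generation pipeline returns the prompt followed by the completion, so the printed text above includes the chat-template markup. A minimal sketch for keeping only the reply is shown below; `extract_reply` is a hypothetical helper, and the strings are stand-ins so the example runs without loading the model.

```python
def extract_reply(generated_text: str, prompt: str) -> str:
    """Return only the newly generated text, dropping the echoed prompt if present."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text when the prompt is not echoed verbatim
    return generated_text.strip()


# Stand-in strings (not real model output) to show the behavior
example_prompt = "<|user|>Hi<|assistant|>"
example_output = example_prompt + " Hello! How can I help?"
reply = extract_reply(example_output, example_prompt)
print(reply)  # Hello! How can I help?
```

With the real pipeline this would be called as `extract_reply(outputs[0]["generated_text"], prompt)`. Passing `return_full_text=False` to the pipeline call is an alternative that avoids the post-processing step.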