---
license: apache-2.0
base_model:
- Trelis/Llama-3.2-1B-Instruct-MATH-3ep
- huihui-ai/Llama-3.2-1B-Instruct-abliterated
- passing2961/Ultron-Summarizer-1B
- unsloth/Llama-3.2-1B-Instruct
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- Trelis/Llama-3.2-1B-Instruct-MATH-3ep
- huihui-ai/Llama-3.2-1B-Instruct-abliterated
- passing2961/Ultron-Summarizer-1B
- unsloth/Llama-3.2-1B-Instruct
---

# DaRuukLLM-Refresh-4x1B-v1

DaRuukLLM-Refresh-4x1B-v1 is a Mixture of Experts (MoE) model made from the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [Trelis/Llama-3.2-1B-Instruct-MATH-3ep](https://huggingface.co/Trelis/Llama-3.2-1B-Instruct-MATH-3ep)
* [huihui-ai/Llama-3.2-1B-Instruct-abliterated](https://huggingface.co/huihui-ai/Llama-3.2-1B-Instruct-abliterated)
* [passing2961/Ultron-Summarizer-1B](https://huggingface.co/passing2961/Ultron-Summarizer-1B)
* [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct)

## 🧩 Configuration

```yaml
base_model: unsloth/Llama-3.2-1B-Instruct # Base model providing self-attention and layer normalization weights
gate_mode: hidden # Use hidden-state representations of the positive prompts to initialize the MoE gate parameters
dtype: bfloat16 # Output data type of the merged model
experts:
  - source_model: Trelis/Llama-3.2-1B-Instruct-MATH-3ep # Expert for math-related tasks
    positive_prompts:
      - "Solve the following math problem:"
      - "Calculate the value of:"
      - "What is the result of:"
  - source_model: huihui-ai/Llama-3.2-1B-Instruct-abliterated # Expert for uncensored queries
    positive_prompts:
      - "Explain the following controversial topic:"
      - "Discuss the implications of:"
      - "Provide an uncensored analysis of:"
  - source_model: passing2961/Ultron-Summarizer-1B # Expert for summarization tasks
    positive_prompts:
      - "Summarize the following text:"
      - "Provide a concise summary of:"
      - "Generate a brief overview of:"
  - source_model: unsloth/Llama-3.2-1B-Instruct # The base model also acts as the general chat expert
    positive_prompts:
      - "How can I assist you today?"
      - "What would you like to discuss?"
      - "Let's have a conversation about:"
```

## 💻 Usage

```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Xiaojian9992024/DaRuukLLM-Refresh-4x1B-v1"

tokenizer = AutoTokenizer.from_pretrained(model)

# Load in 4-bit to fit on smaller GPUs (requires bitsandbytes)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
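Since the gates were initialized from the positive prompts above, phrasing a request in the same style should nudge routing toward the matching expert. A small follow-up sketch, reusing the `pipeline` object from the Usage snippet (the effect is a tendency, not a guarantee, since the gates route per token):

```python
# Continues from the Usage snippet above (reuses `pipeline`).
# Phrasing the request like the math expert's positive prompts should
# bias the hidden-state-initialized gate toward that expert.
messages = [{"role": "user", "content": "Solve the following math problem: 17 * 23"}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=128, do_sample=False)
print(outputs[0]["generated_text"])
```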
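If bitsandbytes is unavailable (or you'd rather skip 4-bit quantization), a minimal full-precision sketch, assuming the merged checkpoint loads through `AutoModelForCausalLM` like any other transformers model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Xiaojian9992024/DaRuukLLM-Refresh-4x1B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
    device_map="auto",           # requires accelerate
)

messages = [{"role": "user", "content": "Summarize the following text: The quick brown fox jumps over the lazy dog."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```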
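For intuition on `gate_mode: hidden`: mergekit derives each layer's router weights from hidden-state representations of the experts' positive prompts, so the router scores tokens by similarity to those prompts. A loose conceptual sketch of that idea (not mergekit's actual code; `init_gate_weights` and its signature are hypothetical):

```python
import torch

def init_gate_weights(base_model, tokenizer, expert_prompts, layer_idx):
    """Conceptual sketch: build one router row per expert from the mean
    hidden state of that expert's positive prompts at a given layer."""
    rows = []
    for prompts in expert_prompts:  # one list of positive prompts per expert
        pooled = []
        for text in prompts:
            ids = tokenizer(text, return_tensors="pt").input_ids
            with torch.no_grad():
                out = base_model(ids, output_hidden_states=True)
            # hidden_states[0] is the embedding output; layer_idx picks a block
            pooled.append(out.hidden_states[layer_idx].mean(dim=1).squeeze(0))
        rows.append(torch.stack(pooled).mean(dim=0))
    # Shape [num_experts, hidden_size]; router logits = weights @ h_t
    return torch.stack(rows)
```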