---
base_model:
- nitky/Swallow-70b-RP
- karakuri-ai/karakuri-lm-70b-chat-v0.1
tags:
- mergekit
- merge
language:
- en
- ja
library_name: transformers
pipeline_tag: text-generation
license: cc-by-nc-sa-4.0
model_type: llama
---
# Swallow-70b-RP-EX

**Important Notice:** For personal and academic use only.

## Description

This model is suitable for role-playing and storytelling, and has great multi-turn chat capabilities. This is probably due to the extremely high multi-turn chat performance of [karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1/tree/main). Thank you for providing such a wonderful model.

This model was created for personal and academic use only.

This merged model uses only fine-tuned Llama 2 models, but the commercial-use licenses of some of those models are unclear. If there is a license problem, the rights holder should contact me directly. No license changes will be made in response to contact from anyone else.

In particular, [karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1/tree/main) is currently distributed under cc-by-sa-4.0, and licensing discussions are taking place [here](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1/tree/main). Since I am not a legal expert, I chose the license for this model based on theirs.

## Test environment

This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). I used the `simple-1` and `Null preset` presets for generation.

### Recommendation

Use `Null preset` with modified temperature settings:

- temperature: 0.3
- top_p: 1.0
- repetition_penalty: 1.0
- top_k: 0

In testing, a lower temperature and a smaller top_k may give better outputs.
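For readers who call `transformers` directly instead of text-generation-webui, the recommended settings above can be collected into keyword arguments for `generate`. This is only a sketch: the helper name `generation_kwargs` is hypothetical, and the assumption is that the webui preset values map one-to-one onto the `transformers` sampling arguments of the same name.

```python
# Recommended sampling settings from this section, as keyword arguments.
# Assumption: these map directly onto transformers' generate() arguments.
RECOMMENDED_SAMPLING = {
    "do_sample": True,          # sampling must be on for temperature to apply
    "temperature": 0.3,
    "top_p": 1.0,               # 1.0 leaves nucleus sampling effectively off
    "top_k": 0,                 # 0 disables top-k filtering
    "repetition_penalty": 1.0,  # 1.0 applies no penalty
}


def generation_kwargs(max_new_tokens=200, **overrides):
    """Return a copy of the recommended settings with optional overrides.

    `generation_kwargs` is a hypothetical helper, not part of any library.
    """
    kwargs = dict(RECOMMENDED_SAMPLING, max_new_tokens=max_new_tokens)
    kwargs.update(overrides)
    return kwargs


if __name__ == "__main__":
    # Example: raise the temperature while keeping the other settings.
    print(generation_kwargs(temperature=0.7))
```

With a loaded model, the result could be passed as `model.generate(input_ids, **generation_kwargs())`.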
### Tested `temperature` Range

- temperature: 0.3 - 1.0

### Tested `repetition_penalty` Range

- repetition_penalty: 1.0 - 1.15

## Prompt template

### Swallow Style (Alpaca format)

```
以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。リクエストを適切に完了するための回答を記述してください。

### 指示:
{instruction}

### 応答:
```

Although not fully tested, the prompt styles of [karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1/tree/main), [Doctor-Shotgun/lzlv-limarpv3-l2-70b](https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b) and [alac/Waxwing-Storytelling-70B-LoRA](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA) are also available.

## Use the instruct model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nitky/Swallow-70b-RP-EX"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# load_in_4bit requires the bitsandbytes package to be installed.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="auto",
    load_in_4bit=True,
)

PROMPT_DICT = {
    "prompt_input": (
        "以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 入力:\n{input}\n\n### 応答:"
    ),
    "prompt_no_input": (
        "以下に、あるタスクを説明する指示があります。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 応答:"
    ),
}


def create_prompt(instruction, input=None):
    """
    Generates a prompt based on the given instruction and an optional input.
    If input is provided, it uses the 'prompt_input' template from PROMPT_DICT.
    If no input is provided, it uses the 'prompt_no_input' template.

    Args:
        instruction (str): The instruction describing the task.
        input (str, optional): Additional input providing context for the task. Default is None.

    Returns:
        str: The generated prompt.
    """
    if input:
        # Use the 'prompt_input' template when additional input is provided
        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input)
    else:
        # Use the 'prompt_no_input' template when no additional input is provided
        return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)


# Example usage
instruction_example = "以下のトピックに関する詳細な情報を提供してください。"
input_example = "東京工業大学の主なキャンパスについて教えてください"
prompt = create_prompt(instruction_example, input_example)

input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt"
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=200,
    temperature=0.3,
    do_sample=True,
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
```

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) and SLERP merge methods, with [tokyotech-llm/Swallow-70b-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-70b-instruct-hf) as a base.
### Models Merged

The following models were included in the merge:

* [karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1)
* [GOAT-AI/GOAT-70B-Storytelling](https://huggingface.co/GOAT-AI/GOAT-70B-Storytelling)
* [dreamgen/opus-v0.5-70b](https://huggingface.co/dreamgen/opus-v0.5-70b)
* [Doctor-Shotgun/lzlv-limarpv3-l2-70b](https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b)
* [LoRA] [alac/Waxwing-Storytelling-70B-LoRA](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA)

### Configuration

Command example:

```bash
# please change the path and options according to your environment
mergekit-mega --cuda Swallow-70b-RP-EX.yml ~/text-generation-webui/models
```

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: nitky/Swallow-70b-RP
    # no parameters necessary for base model
  - model: karakuri-ai/karakuri-lm-70b-chat-v0.1
    parameters:
      density: 1
      weight:
        - filter: mlp
          value: 0.1
        - filter: self_attn
          value: 0.4
        - value: 0 # fallback for rest of tensors.
merge_method: dare_ties
base_model: nitky/Swallow-70b-RP
dtype: bfloat16
tokenizer_source: union
name: Swallow-70b-RP-EX-base
---
models:
  - model: nitky/Swallow-70b-RP
    # no parameters necessary for base model
  - model: karakuri-ai/karakuri-lm-70b-chat-v0.1
    parameters:
      density: 1
      weight:
        - filter: mlp
          value: [0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.1]
        - filter: self_attn
          value: [0.4, 0.4, 0.1, 0.4, 0.1, 0.4, 0.1, 0.4, 0.4]
        - value: 0 # fallback for rest of tensors.
merge_method: dare_ties
base_model: nitky/Swallow-70b-RP
dtype: bfloat16
tokenizer_source: union
name: Swallow-70b-RP-EX-flavor
---
slices:
  - sources:
      - model: Swallow-70b-RP-EX-base
        layer_range: [0, 80]
      - model: Swallow-70b-RP-EX-flavor
        layer_range: [0, 80]
merge_method: slerp
base_model: Swallow-70b-RP-EX-base
parameters:
  t: # model stabilization
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: bfloat16
name: Swallow-70b-RP-EX
```
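To make the final SLERP step of the configuration above easier to read, here is a rough Python sketch of how a list-valued `t` could turn into a per-layer interpolation factor and how that factor blends two weight vectors. It is not mergekit's code. The function names `gradient_value` and `slerp` are hypothetical, and the sketch assumes that list values are linearly interpolated across the layer range and that `t = 0` selects the `base_model` while `t = 1` selects the other model.

```python
import math


def gradient_value(anchors, layer_index, num_layers):
    """Linearly interpolate a list of anchor values across the layers.

    Assumption: the layer position is mapped to [0, 1] as
    layer_index / (num_layers - 1).
    """
    if num_layers < 2:
        return float(anchors[0])
    position = layer_index / (num_layers - 1)
    scaled = position * (len(anchors) - 1)
    i0 = int(scaled)
    i1 = min(i0 + 1, len(anchors) - 1)
    frac = scaled - i0
    return (1 - frac) * anchors[i0] + frac * anchors[i1]


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    lerp = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp  # angle undefined for a zero vector
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    if abs(cos_theta) > 1 - 1e-6:
        return lerp  # nearly colinear: fall back to linear interpolation
    theta = math.acos(cos_theta)
    w0 = math.sin((1 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Values copied from the `t` section of the SLERP configuration.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]
NUM_LAYERS = 80  # from layer_range: [0, 80]

if __name__ == "__main__":
    for layer in (0, 20, 40, 60, 79):
        attn_t = gradient_value(SELF_ATTN_T, layer, NUM_LAYERS)
        mlp_t = gradient_value(MLP_T, layer, NUM_LAYERS)
        print(f"layer {layer:2d}: self_attn t={attn_t:.3f}, mlp t={mlp_t:.3f}")
```

Under these assumptions, the first layer takes its self-attention weights entirely from `Swallow-70b-RP-EX-base` and its MLP weights entirely from `Swallow-70b-RP-EX-flavor`, and the last layer does the opposite. Tensors matched by neither filter use the fallback `t = 0.5`.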