---
base_model:
  - NousResearch/Hermes-3-Llama-3.1-70B
  - Fizzarolli/L3.1-70b-glitz-v0.2
  - cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
  - Sao10K/L3-70B-Euryale-v2.1
tags:
  - merge
  - mergekit
  - lazymergekit
  - NousResearch/Hermes-3-Llama-3.1-70B
  - Fizzarolli/L3.1-70b-glitz-v0.2
  - cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
  - Sao10K/L3-70B-Euryale-v2.1
---

# L3.1-70b-Ginny

Like with everything, I have to start somewhere, right? As such, this model is named Ginny.

L3.1-70b-Ginny is a merge of the following models using LazyMergekit:

- [NousResearch/Hermes-3-Llama-3.1-70B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-70B)
- [Fizzarolli/L3.1-70b-glitz-v0.2](https://huggingface.co/Fizzarolli/L3.1-70b-glitz-v0.2)
- [cyberagent/Llama-3.1-70B-Japanese-Instruct-2407](https://huggingface.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407)
- [Sao10K/L3-70B-Euryale-v2.1](https://huggingface.co/Sao10K/L3-70B-Euryale-v2.1)

Using Hermes as a base, I mixed in Glitz and Euryale, both of which I liked. I think I actually prefer Glitz.

Additionally, I decided to throw in cyberagent's Japanese Instruct in the hope that it will boost Japanese capabilities.

(Though, on recommendations from others, I've steeled myself to never use Hermes as a base again.)

## 🧩 Configuration

```yaml
models:
  - model: NousResearch/Hermes-3-Llama-3.1-70B
    parameters:
      density: 0.33
      weight: 0.25
  - model: Fizzarolli/L3.1-70b-glitz-v0.2
    parameters:
      density: 0.7
      weight: 0.5
  - model: cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
    parameters:
      density: 0.5
      weight: 0.25
  - model: Sao10K/L3-70B-Euryale-v2.1
    parameters:
      density: 0.7
      weight: 0.5

merge_method: ties
base_model: NousResearch/Hermes-3-Llama-3.1-70B
parameters:
  normalize: true
dtype: bfloat16
```
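
If you want to reproduce the merge yourself, here is a minimal sketch using the mergekit CLI, assuming mergekit is installed and the configuration above is saved as `config.yaml` (the output directory name is arbitrary):

```bash
# Run the TIES merge described above; --cuda uses a GPU for the merge if available.
mergekit-yaml config.yaml ./L3.1-70b-Ginny --cuda
```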

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "KaraKaraWitch/L3.1-70b-Ginny"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt using the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model across available GPUs and generate.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
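
Note that this is a 70B-parameter model; in fp16/bf16 it needs on the order of 140 GB of VRAM, so it will not fit on a single consumer GPU. Below is a minimal, untested sketch of loading it in 4-bit with bitsandbytes instead; the model ID is the same, but `bitsandbytes` is an extra dependency and the generation parameters are illustrative rather than tuned.

```python
# Minimal 4-bit loading sketch (assumes: pip install -qU transformers accelerate bitsandbytes).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "KaraKaraWitch/L3.1-70b-Ginny"

# Quantize weights to 4-bit on load; do compute in bfloat16.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "What is a large language model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```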