Triangle104 committed 999dbfd (verified; parent f850b7a): Update README.md

This model was converted to GGUF format from [`nbeerbower/Dumpling-Qwen2.5-1.5B-v2`](https://huggingface.co/nbeerbower/Dumpling-Qwen2.5-1.5B-v2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/nbeerbower/Dumpling-Qwen2.5-1.5B-v2) for more details on the model.
---

nbeerbower/EVA-abliterated-TIES-Qwen2.5-1.5B fine-tuned on:

- nbeerbower/GreatFirewall-DPO
- nbeerbower/Schule-DPO
- nbeerbower/Purpura-DPO
- nbeerbower/Arkhaios-DPO
- jondurbin/truthy-dpo-v0.1
- antiven0m/physical-reasoning-dpo
- flammenai/Date-DPO-NoAsterisks
- flammenai/Prude-Phi3-DPO
- Atsunori/HelpSteer2-DPO (1,000 samples)
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- nbeerbower/gutenberg-moderne-dpo

### Method

QLoRA ORPO tune with 2x RTX 3090 for 2 epochs.

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig
from trl import ORPOConfig

# torch_dtype and new_model are defined elsewhere in the training script.

# QLoRA config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

# LoRA config
peft_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj'],
)

# Training config
orpo_args = ORPOConfig(
    run_name=new_model,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    max_length=2048,
    max_prompt_length=1024,
    max_completion_length=1024,
    beta=0.1,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,
    optim="paged_adamw_8bit",
    num_train_epochs=2,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=10,
    max_grad_norm=10,
    report_to="wandb",
    output_dir="./results/",
    bf16=True,
)
```

---
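As a quick sanity check on the training configuration above: the effective batch size per optimizer step follows from the per-device batch size, the gradient accumulation steps, and the GPU count. A minimal sketch, using the values from the config and assuming both of the stated 2x RTX 3090 cards were used for data parallelism:

```python
# Values taken from the ORPO config above; num_gpus assumes the 2x RTX 3090 setup.
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
num_gpus = 2

# Effective batch size = per-device batch * accumulation steps * number of GPUs.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 16
```

Keeping the per-device batch at 1 and accumulating gradients over 8 steps trades wall-clock time for memory, which is what lets a 4-bit QLoRA run fit comfortably in a 3090's 24 GB of VRAM while still training with an effective batch of 16.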
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)