# DeepSeek‑Prover‑V2‑7B · LoRA Adapter
This repository hosts a LoRA adapter fine‑tuned on top of `deepseek-ai/DeepSeek-Prover-V2-7B` using 🤗 TRL's `SFTTrainer`.
## Training Setup
| Hyper‑parameter | Value |
|---|---|
| Learning rate | 2 × 10⁻⁴ |
| Batch size per device | 16 |
| Gradient accumulation steps | 1 |
| Effective batch size | 16 |
| Epochs | 1 |
| LR scheduler | linear |
| Warm‑up ratio | 0.03 |
| Weight decay | 0.01 |
| Seed | 42 |
| Sequence length | 1792 |
| Flash‑Attention‑2 | ✅ (`use_flash_attention_2=True`) |
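
For reference, here is a minimal sketch of how these hyper‑parameters map onto TRL's `SFTConfig`/`SFTTrainer`. The dataset name, split, and output directory are placeholders (they are not part of this repository), and the exact keyword names may vary slightly between TRL versions:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical dataset; substitute the actual SFT dataset used for this adapter.
dataset = load_dataset("your-username/conjecture-sft-data", split="train")

config = SFTConfig(
    output_dir="deepseek-prover-v2-7b-lora",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,   # effective batch size = 16
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    weight_decay=0.01,
    seed=42,
    max_seq_length=1792,
    bf16=True,
)

trainer = SFTTrainer(
    model="deepseek-ai/DeepSeek-Prover-V2-7B",
    args=config,
    train_dataset=dataset,
    # peft_config=lora_config,  # LoRA settings, see the sketch in the next section
)
trainer.train()
```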
### LoRA configuration
| Setting | Value |
|---|---|
| Rank *r* | 16 |
| α | 32 |
| Dropout | 0.05 |
| Target modules | all linear layers |
| Modules saved | `embed_tokens`, `lm_head` |
| Bias | none |
RoPE scaling: YaRN, factor = 16.0, β_fast = 32.0, β_slow = 1.0.
Training was performed on GPUs in bfloat16 precision (`torch_dtype=torch.bfloat16`).
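
A hedged sketch of how this configuration could be expressed with 🤗 PEFT's `LoraConfig` and the base-model loader is shown below. Support for the `rope_scaling` override and the `use_flash_attention_2` flag depends on your `transformers` version, so treat this as illustrative rather than the exact training script:

```python
import torch
from peft import LoraConfig
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",                  # adapt all linear layers
    modules_to_save=["embed_tokens", "lm_head"],  # trained and saved in full
    bias="none",
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-Prover-V2-7B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    use_flash_attention_2=True,  # newer transformers: attn_implementation="flash_attention_2"
    rope_scaling={"type": "yarn", "factor": 16.0, "beta_fast": 32.0, "beta_slow": 1.0},
)
```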
## Loss Curve
## Usage
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the base model together with this LoRA adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    "your-username/DeepSeek-Prover-V2-7B-conjecture-chat-new-config-20250724_0955",
    trust_remote_code=True,
)

# The tokenizer is unchanged from the base model.
tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Prover-V2-7B", trust_remote_code=True)

prompt = "Prove that the sum of two even numbers is even."
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```
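
If you prefer to serve plain `transformers` weights without the PEFT wrapper at inference time, the adapter can optionally be folded into the base model with PEFT's `merge_and_unload()`; the output directory below is just an example name:

```python
# Merge the LoRA weights into the base model and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("DeepSeek-Prover-V2-7B-conjecture-merged")
tok.save_pretrained("DeepSeek-Prover-V2-7B-conjecture-merged")
```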