# DeepSeek-Prover-V2-7B · LoRA Adapter

This repository hosts a LoRA adapter fine-tuned on top of `deepseek-ai/DeepSeek-Prover-V2-7B` using 🤗 `trl`'s `SFTTrainer`.


## Training Setup

| Hyper-parameter | Value |
| --- | --- |
| Learning rate | 2 × 10⁻⁴ |
| Batch size per device | 16 |
| Gradient accumulation steps | 1 |
| Effective batch size | 16 |
| Epochs | 1 |
| Scheduler | linear |
| Warm-up ratio | 0.03 |
| Weight decay | 0.01 |
| Seed | 42 |
| Sequence length | 1792 |
| Flash-Attention-2 | ✅ (`use_flash_attention_2=True`) |
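For reference, the sketch below shows how these values map onto TRL's `SFTConfig`. It is illustrative only: the `output_dir` is a placeholder and argument names can vary slightly across `trl` versions.

```python
# Sketch: the training hyperparameters above expressed as a TRL SFTConfig.
# Note: the sequence-length argument is named max_length in newer trl releases.
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="outputs",              # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,     # effective batch size = 16 × 1 = 16
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    weight_decay=0.01,
    seed=42,
    max_seq_length=1792,
    bf16=True,                         # bfloat16 training (see note below)
)
```

Together with the LoRA configuration below, such a config would be passed to `SFTTrainer(model=..., args=training_args, peft_config=..., train_dataset=...)`.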

## LoRA configuration

| Setting | Value |
| --- | --- |
| Rank r | 16 |
| α | 32 |
| Dropout | 0.05 |
| Target modules | all linear layers |
| Modules saved | `embed_tokens`, `lm_head` |
| Bias | none |
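The same settings expressed as a `peft` `LoraConfig` (a sketch; `"all-linear"` is peft's shorthand for targeting every linear layer):

```python
# Sketch of the LoRA configuration described above.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",                  # all linear layers
    modules_to_save=["embed_tokens", "lm_head"],  # trained fully and saved with the adapter
    bias="none",
    task_type="CAUSAL_LM",
)
```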

RoPE scaling: YaRN, factor = 16.0, β_fast = 32.0, β_slow = 1.0

Training was performed on GPUs with bfloat16 precision (`torch_dtype=torch.bfloat16`).
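As a sketch of how this translates into a loading call, the snippet below loads the base model in bfloat16 with Flash-Attention-2; the `rope_scaling` dict mirrors the YaRN values above, but its exact key names are an assumption here and normally come straight from the checkpoint's `config.json`.

```python
# Sketch: loading the base model in bfloat16 with Flash-Attention-2.
# The rope_scaling keys below are assumptions mirroring the YaRN settings listed above.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-Prover-V2-7B",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # newer spelling of use_flash_attention_2=True
    trust_remote_code=True,
    rope_scaling={"type": "yarn", "factor": 16.0, "beta_fast": 32.0, "beta_slow": 1.0},
)
```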


## Loss Curve

*Loss curves (figure).*


## Usage

```python
from peft import AutoPeftModelForCausalLM  # AutoPeftModelForCausalLM lives in peft, not transformers
from transformers import AutoTokenizer

# Load the base model together with this LoRA adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    "your-username/DeepSeek-Prover-V2-7B-conjecture-chat-new-config-20250724_0955",
    trust_remote_code=True,
)
# The tokenizer comes from the base model.
tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Prover-V2-7B", trust_remote_code=True)

prompt = "Prove that the sum of two even numbers is even."
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```
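If a standalone checkpoint is needed for serving, the adapter can be merged into the base weights (a sketch; the output directory is a placeholder):

```python
# Optional: merge the LoRA adapter into the base model and save a standalone copy.
merged = model.merge_and_unload()                         # PeftModel -> plain transformers model
merged.save_pretrained("DeepSeek-Prover-V2-7B-merged")    # placeholder output directory
tok.save_pretrained("DeepSeek-Prover-V2-7B-merged")
```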