# DeepSeek-Prover-V2-7B LoRA (v1)

Rank-16 LoRA adapter fine-tuned from `deepseek-ai/DeepSeek-Prover-V2-7B` on a conversational Lean theorem-proving corpus.


## Model Details

| Field | Value |
|---|---|
| Developed by | HAIELab |
| Adapter type | LoRA (r = 16, α = 32, dropout = 0.05) |
| Precision | bf16 · Flash-Attention v2 |
| Context length | 1 792 tokens (left-padding) |
| Hardware | 1 × H100 80 GB |
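
For reference, the adapter settings above map directly onto a PEFT `LoraConfig`. The sketch below is a hedged reconstruction: `target_modules` is an assumption (a common choice for this architecture), not stated on the card.

```python
from peft import LoraConfig

# Hedged reconstruction of the adapter config from the table above.
# target_modules is assumed -- the card only states r, alpha, and dropout.
lora_config = LoraConfig(
    r=16,                 # LoRA rank
    lora_alpha=32,        # scaling factor alpha
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
```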

## Intended Uses

- Interactive Lean proof hints
- Conjecture generation
- Research on automated theorem proving


## Hyper-parameters

| Parameter | Value |
|---|---|
| Epochs | 1 |
| Max seq length | 1 792 |
| Batch size | 16 train / 16 eval |
| Grad accum | 1 |
| LR schedule | Cosine, base 2 × 10⁻⁴, warm-up 3 % |
| Weight decay | 0.01 |
| Optimiser | AdamW |
| LoRA | r 16 · α 32 · dropout 0.05 |
| Precision | bf16 mixed, gradient-checkpointing |
| Logging | every 50 steps |
| Evaluation | once per epoch |
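
These settings translate almost one-to-one into Hugging Face `TrainingArguments`. A minimal sketch under that assumption; `output_dir` and anything not listed in the table are placeholders or library defaults, not values from the card.

```python
from transformers import TrainingArguments

# Sketch mapping the hyper-parameter table onto TrainingArguments.
# output_dir is a placeholder; unlisted fields keep library defaults.
args = TrainingArguments(
    output_dir="out",                 # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    optim="adamw_torch",
    bf16=True,
    gradient_checkpointing=True,
    logging_steps=50,
    eval_strategy="epoch",            # `evaluation_strategy` on older transformers
)
```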

## Results (1 epoch)

| Metric | Value |
|---|---|
| Final train loss | 1.2297 |
| Final eval loss | 1.2185 |
| Mean token accuracy | 0.696 |
| Tokens processed | 156 814 576 |
| Train runtime | 10 h 17 m |
| Train steps/s | 0.295 |
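
As a quick sanity check, the runtime and token count above imply an effective throughput of roughly 4.2 k tokens/s on the single H100:

```python
# Back-of-the-envelope check of the numbers in the results table.
tokens    = 156_814_576
runtime_s = 10 * 3600 + 17 * 60                # 10 h 17 m = 37 020 s
print(f"{tokens / runtime_s:,.0f} tokens/s")   # ≈ 4,236 tokens/s
print(f"{0.295 * runtime_s:,.0f} steps")       # ≈ 10,921 optimizer steps
```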

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# --- Hub repo IDs -----------------------------------------------------------
adapter_id = "haielab/DeepSeek-Prover-V2-7B-conjecture-base-FineTune-new-config"
base_id    = "deepseek-ai/DeepSeek-Prover-V2-7B"

# --- 1️⃣  Tokenizer ----------------------------------------------------------
tok = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
tok.padding_side, tok.pad_token = "left", tok.eos_token     # DeepSeek expects left-padding

# --- 2️⃣  Load base model ----------------------------------------------------
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn to be installed
    device_map="auto",          # auto-place on available GPU(s)
)

# --- 3️⃣  Inject LoRA adapter ------------------------------------------------
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# --- 4️⃣  Generate proof continuation ---------------------------------------
prompt  = "<user>Theorem foo …</user><assistant>"
inputs  = tok(prompt, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,             # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)
print(tok.decode(out[0], skip_special_tokens=True))
```
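
Because the tokenizer is configured for left padding and the model was trained at 1 792 tokens, batched prompting follows naturally. A minimal sketch, reusing `tok` and `model` from above (the prompts are placeholders, not examples from the training corpus):

```python
# Batched generation: left padding keeps each prompt's last real token
# adjacent to the generated continuation.
prompts = [
    "<user>Theorem foo …</user><assistant>",   # placeholder
    "<user>Theorem bar …</user><assistant>",   # placeholder
]
batch = tok(
    prompts,
    return_tensors="pt",
    padding=True,          # left padding, as configured above
    truncation=True,
    max_length=1792,       # training context length
).to(model.device)

out = model.generate(
    **batch,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tok.pad_token_id,
)
for text in tok.batch_decode(out, skip_special_tokens=True):
    print(text)
```

For deployment without a PEFT dependency, the adapter can also be folded into the base weights via `model.merge_and_unload()`, which returns a plain `transformers` model.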