This is a toy model trained with chain-of-thought supervised fine-tuning (CoT-SFT) followed by GRPO (Group Relative Policy Optimization).
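As a rough, hedged sketch of what the GRPO stage can look like, the snippet below uses TRL's `GRPOTrainer`. The base model name, dataset, reward function, and hyperparameters are all illustrative placeholders, not the actual recipe used for this checkpoint.

```python
# Minimal GRPO sketch with TRL (illustrative only; not this model's actual recipe).
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Placeholder dataset; a real run would use a finance CoT dataset.
dataset = load_dataset("trl-lib/tldr", split="train")

def toy_reward(completions, **kwargs):
    # Toy reward: penalize deviation from a target completion length.
    # A real setup would score the correctness of the final answer instead.
    return [-abs(200 - len(c)) for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-3B-Instruct",  # assumed base model
    reward_funcs=toy_reward,
    args=GRPOConfig(output_dir="qwen2.5-3b-grpo"),
    train_dataset=dataset,
)
trainer.train()
```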
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("yixuantt/Qwen2.5-3B-R1-Finance")
model = AutoModelForCausalLM.from_pretrained(
    "yixuantt/Qwen2.5-3B-R1-Finance",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

def generate(text):
    # Build the chat with the system prompt the model was trained with.
    conv = [
        {
            "role": "system",
            "content": (
                "You are a helpful AI Assistant that provides well-reasoned and "
                "detailed responses. You first think about the reasoning process "
                "as an internal monologue and then provide the user with the answer."
            ),
        },
        {"role": "user", "content": text},
    ]
    prompt = tokenizer.apply_chat_template(
        conversation=conv, tokenize=False, add_generation_prompt=True
    )
    encoded = tokenizer(prompt, return_tensors="pt").to(model.device)
    generate_params = dict(
        max_new_tokens=1024,
        do_sample=True,
        top_k=20,
    )
    with torch.no_grad():
        generation_output = model.generate(
            input_ids=encoded.input_ids,
            attention_mask=encoded.attention_mask,
            **generate_params,
        )
    # Strip the prompt tokens, keeping only the newly generated answer.
    generation_output = generation_output[:, encoded.input_ids.shape[1]:]
    return tokenizer.decode(generation_output[0], skip_special_tokens=True)
```
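For example (the question below is just an illustrative placeholder):

```python
# Illustrative call; any finance question works here.
print(generate("What is the difference between ROE and ROA?"))
```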