Chemistry Model - Fine-tuned Qwen2.5-3B-Instruct (Fixed)
This is a fine-tuned version of Qwen2.5-3B-Instruct trained for chemistry-related tasks using GRPO (Group Relative Policy Optimization). The model was saved at global step 70.
⚠️ This is a fixed version - the original upload contained distributed tensor metadata that caused loading issues. This version has been properly consolidated.
Model Details
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Architecture: Qwen2ForCausalLM
- Training Algorithm: GRPO with VLLM async rollouts
- Training Step: 70
- Framework: PyTorch + Transformers
- Original checkpoint: ckpts/global_step_70
Training Configuration
This model was trained using the chemistry environment from skyrl-gym with the following key parameters:
- Learning rate: 1.0e-6
- Train batch size: 1024
- Max generation length: 1024 tokens
- Environment: ChemGuesser (molecular similarity scoring)
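GRPO needs no learned value model: each sampled completion is scored relative to the other completions generated for the same prompt. A minimal sketch of that group-relative advantage (illustrative only, assuming a `(num_prompts, group_size)` reward tensor; this is not the actual training code):

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Compute GRPO-style advantages from a (num_prompts, group_size) reward tensor."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    # Each rollout is judged against the other rollouts for the same prompt,
    # which removes the need for a separate critic/value model.
    return (rewards - mean) / (std + eps)
```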
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("runrl/chemistry-step-70")
tokenizer = AutoTokenizer.from_pretrained("runrl/chemistry-step-70")

# Example usage for chemistry tasks
prompt = "Predict the molecular structure for the compound with SMILES: "
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
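Since the base model is instruction-tuned, formatting the prompt with the tokenizer's chat template may give better results. A hedged sketch (the prompt text here is only an example):

```python
messages = [{"role": "user", "content": "Propose a molecule similar to aspirin and give its SMILES."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```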
Training Environment
This model was specifically trained for chemistry tasks involving molecular structure prediction and similarity scoring.
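The reward itself comes from the ChemGuesser environment in skyrl-gym. As a rough illustration of molecular similarity scoring (not necessarily the environment's implementation), a Tanimoto similarity over Morgan fingerprints with RDKit might look like:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs

def similarity_score(predicted_smiles: str, target_smiles: str) -> float:
    """Tanimoto similarity between two molecules; 0.0 if either SMILES is invalid."""
    pred = Chem.MolFromSmiles(predicted_smiles)
    target = Chem.MolFromSmiles(target_smiles)
    if pred is None or target is None:
        return 0.0
    fp_pred = AllChem.GetMorganFingerprintAsBitVect(pred, 2, nBits=2048)
    fp_target = AllChem.GetMorganFingerprintAsBitVect(target, 2, nBits=2048)
    return DataStructs.TanimotoSimilarity(fp_pred, fp_target)
```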
Technical Notes
- Consolidated from 4-rank FSDP2 checkpoint
- DTensors properly converted to regular PyTorch tensors
- FSDP2 sharded parameters reconstructed into full model
- Compatible with standard Transformers loading
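For reference, the consolidation amounts to gathering each DTensor back into a plain tensor before saving. A rough sketch (names are illustrative, not the actual script; `full_tensor()` is a collective and must run on all ranks of the original process group, and on older PyTorch releases `DTensor` lives under `torch.distributed._tensor`):

```python
from torch.distributed.tensor import DTensor

def consolidate(sharded_state_dict: dict) -> dict:
    """Convert an FSDP2 DTensor-sharded state dict into plain CPU tensors."""
    full = {}
    for name, value in sharded_state_dict.items():
        if isinstance(value, DTensor):
            value = value.full_tensor()  # all-gather the shards into one tensor
        full[name] = value.detach().cpu()
    return full
```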