---
library_name: transformers
tags: []
---
|
|
|
# ***Mol-MoE***: Training Preference-Guided Routers for Molecule Generation

*Diego Calanzone (1, 2), Pierluca D'Oro (2), Pierre-Luc Bacon (1, 2)* <br>

*(1) Université de Montréal, (2) Mila Quebec AI Institute* <br>

**arXiv**: https://arxiv.org/abs/2502.05633
|
|
|
**Abstract**: Recent advances in language models have enabled framing molecule generation as sequence modeling. However, existing approaches often rely on single-objective reinforcement learning, limiting their applicability to real-world drug design, where multiple competing properties must be optimized. Traditional multi-objective reinforcement learning (MORL) methods require costly retraining for each new objective combination, making rapid exploration of trade-offs impractical. To overcome these limitations, we introduce Mol-MoE, a mixture-of-experts (MoE) architecture that enables efficient test-time steering of molecule generation without retraining. Central to our approach is a preference-based router training objective that incentivizes the router to combine experts in a way that aligns with user-specified trade-offs. This provides improved flexibility in exploring the chemical property space at test time, facilitating rapid trade-off exploration. Benchmarking against state-of-the-art methods, we show that Mol-MoE achieves superior sample quality and steerability.
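
To give a concrete sense of what preference-guided routing means, here is a minimal, purely illustrative sketch of preference-conditioned expert mixing. It is **not** the released Mol-MoE implementation (the actual architecture, router placement, and training objective are described in the paper), and every name in it is made up for the example.

```python
import torch
import torch.nn as nn

class ToyPreferenceRouter(nn.Module):
    """Illustrative only: map a user preference vector over k properties
    to mixing weights over k experts, then blend the experts' outputs."""

    def __init__(self, num_experts: int, hidden_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_experts, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_experts),
        )

    def forward(self, preferences: torch.Tensor, expert_outputs: torch.Tensor) -> torch.Tensor:
        # preferences: (batch, k) user-specified trade-offs, e.g. [0.5, 0.1, 0.1, 0.2, 0.1]
        # expert_outputs: (k, batch, dim) hidden states produced by the k property experts
        mixing = torch.softmax(self.net(preferences), dim=-1)  # (batch, k)
        return torch.einsum("bk,kbd->bd", mixing, expert_outputs)
```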
|
|
|
|
|
## How to use this model

This language model is fine-tuned to generate molecules in the SMILES format according to desired properties.

For unconditioned SMILES generation, prompt the model with the BOS token `<s>`. <br>

For conditioned generation, please refer to the paper and the official codebase to derive the different conditioned models. <br>
|
This model is the result of merging five fine-tuned experts (`JNK3`, `DRD2`, `GSK3B`, `CYP2D6`, `CYP2C19`) with equal interpolation weight *w_i = 0.2*.
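
For reference, a hypothetical sketch of what such a uniform weight-space merge could look like; the expert checkpoint paths below are placeholders, and the official codebase should be preferred for reproducing the released weights.

```python
# Hypothetical sketch only: the expert paths below are placeholders, not real repositories.
from transformers import AutoModelForCausalLM

expert_paths = [
    "path/to/expert-jnk3",
    "path/to/expert-drd2",
    "path/to/expert-gsk3b",
    "path/to/expert-cyp2d6",
    "path/to/expert-cyp2c19",
]
w = 1.0 / len(expert_paths)  # equal interpolation weight, i.e. 0.2 for five experts

# Accumulate a weighted average of the experts' parameters in float32.
merged_state = {}
for path in expert_paths:
    state = AutoModelForCausalLM.from_pretrained(path).state_dict()
    for k, v in state.items():
        merged_state[k] = merged_state.get(k, 0.0) + w * v.float()

# Load the averaged parameters back into a model with the same architecture.
merged = AutoModelForCausalLM.from_pretrained(expert_paths[0])
reference = merged.state_dict()
merged.load_state_dict({k: v.to(reference[k].dtype) for k, v in merged_state.items()})
merged.save_pretrained("mol-llama-1b-merged")
```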
|
|
|
An example of the generation pipeline:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import re

# Setup
device = "cuda"
tokenizer = AutoTokenizer.from_pretrained("ddidacus/RS-mol-llama-1b")
model = AutoModelForCausalLM.from_pretrained("ddidacus/RS-mol-llama-1b").to(device)
generation_kwargs = {
    "max_new_tokens": 128,
    "min_length": -1,
    "top_k": 0,
    "top_p": 0.9,
    "do_sample": True,
    "pad_token_id": tokenizer.eos_token_id,
    "temperature": 1.0
}

# Inference: unconditioned generation starts from the BOS token
query = "<s>"
toks = tokenizer([query], return_tensors="pt")["input_ids"].to(device)
output = model.generate(toks, **generation_kwargs)
output = tokenizer.batch_decode(output)

# Parsing: extract the SMILES string between the BOS and EOS tokens
pattern = r'<s>(.*?)</s>'
molecule = re.findall(pattern, output[0], re.DOTALL)
```
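
`molecule` then holds the SMILES string(s) captured between the BOS and EOS tokens. If RDKit is installed in your environment, a quick validity filter can be applied to the parsed output (an optional sketch, not part of the original pipeline):

```python
# Optional: keep only chemically valid SMILES (requires the `rdkit` package).
from rdkit import Chem

valid_smiles = [s.strip() for s in molecule if Chem.MolFromSmiles(s.strip()) is not None]
print(valid_smiles)
```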
|
|
|
### Model Description

This model is a fine-tuned version of Llama 3.2 1B, trained in two stages:

1. Supervised fine-tuning on ~3.5M molecules extracted from ZINC 250K, MOSES, and ChEMBL

2. RL fine-tuning with RLOO on five distinct reward functions from PyTDC [1]



- **Developed by:** Diego Calanzone ([email protected])

- **Model type:** Decoder-only Transformer

- **Finetuned from model:** Llama 3.2 1B



Read the paper for further details.
|
|
|
### Sources

[1] https://tdcommons.ai/single_pred_tasks/overview
|
|
|