Fine-tuning on actual perturbation data
Thanks for the valuable work and prompt responses in the discussion!
I was wondering if you have any suggestions for fine-tuning the model on real perturbation data. Is the current architecture suitable for training on control and perturbed cells simultaneously? Alternatively, would you suggest computing features from each condition separately and building a model on top of them? Or do you have any other ideas?
Thank you for your interest in Geneformer! You can definitely train on multiple cell states. It depends on your scientific question, but one way to approach it would be to fine-tune the model to distinguish between the control and perturbed states and then perform in silico perturbations to identify genes that are predicted to move cells from one state to the other.
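To make the second step more concrete, here is a minimal sketch of one way to score in silico perturbations: compare a cell's embedding before and after each perturbation against the centroid of the target state. This is not Geneformer's API. The embeddings are toy numbers, the gene names are placeholders, and the helpers `cosine`, `centroid`, and `rank_perturbations` are hypothetical; in practice the embeddings would come from the fine-tuned model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def rank_perturbations(original, perturbed_by_gene, target_centroid):
    """Rank genes by how far perturbing them shifts the cell toward the target.

    The score is the change in cosine similarity to the target centroid;
    a positive score means the perturbed embedding moved toward the target state.
    """
    base = cosine(original, target_centroid)
    shifts = {
        gene: cosine(emb, target_centroid) - base
        for gene, emb in perturbed_by_gene.items()
    }
    return sorted(shifts.items(), key=lambda kv: kv[1], reverse=True)

# Toy 2-D embeddings (assumed values, for illustration only).
perturbed_state_cells = [[0.0, 1.0], [0.2, 0.8]]
target = centroid(perturbed_state_cells)

control_cell = [1.0, 0.0]
# Hypothetical embeddings of the same control cell after perturbing each gene.
after_perturbation = {"GENE_A": [0.5, 0.5], "GENE_B": [1.0, -0.2]}

ranking = rank_perturbations(control_cell, after_perturbation, target)
for gene, shift in ranking:
    print(f"{gene}: {shift:+.3f}")
```

Genes at the top of the ranking are the candidates predicted to move control cells toward the perturbed state; swapping the roles of the two states gives the reverse direction.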
Thanks, training on multiple states is a great suggestion and fits my purpose! Do you already have a tutorial for training on different cell states? How do you suggest approaching that?
We strongly recommend hyperparameter tuning for fine-tuning, so the best example to follow would be the one for disease classification, but substituting the normal vs. disease states with your cell states of interest.
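As a rough illustration of the substitution, the sketch below maps arbitrary cell-state names to integer class labels, makes a stratified train/eval split, and draws a few random hyperparameter configurations to try. It is generic Python, not code from the disease classification example: the function names, the search ranges, and the `"control"`/`"perturbed"` state names are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_label_map(states):
    """Map each distinct cell-state name to an integer class id (sorted for determinism)."""
    return {name: i for i, name in enumerate(sorted(set(states)))}

def stratified_split(states, eval_fraction=0.2, seed=0):
    """Return (train_idx, eval_idx) with every state represented in both splits."""
    rng = random.Random(seed)
    by_state = defaultdict(list)
    for idx, state in enumerate(states):
        by_state[state].append(idx)
    train_idx, eval_idx = [], []
    for state in sorted(by_state):
        idxs = by_state[state][:]
        rng.shuffle(idxs)
        n_eval = max(1, round(len(idxs) * eval_fraction))
        eval_idx.extend(idxs[:n_eval])
        train_idx.extend(idxs[n_eval:])
    return sorted(train_idx), sorted(eval_idx)

def sample_hyperparameters(n_trials, seed=0):
    """Draw random configurations; the ranges here are illustrative assumptions."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        trials.append({
            "learning_rate": 10 ** rng.uniform(-6, -3),  # log-uniform
            "num_train_epochs": rng.choice([1, 2, 3]),
            "warmup_steps": rng.choice([100, 500, 1000]),
            "weight_decay": rng.uniform(0.0, 0.3),
        })
    return trials

# Per-cell state annotations replace the normal vs. disease labels.
states = ["control"] * 10 + ["perturbed"] * 10
label_map = build_label_map(states)
labels = [label_map[s] for s in states]

train_idx, eval_idx = stratified_split(states)
trials = sample_hyperparameters(5)
```

Each sampled configuration would then be used for one fine-tuning run, keeping the one that scores best on the evaluation split.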