---
title: README
emoji: 🐨
colorFrom: purple
colorTo: blue
sdk: static
pinned: true
license: bsd-3-clause
short_description: Ensemble of experts for cell-type annotation
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/63d7697f2e397d9f8e30e677/tvABibiml6K2sccfXLybG.png
---
# **popV**
|
Welcome to the **popV** framework. popV uses an ensemble-of-experts approach to achieve state-of-the-art performance in cell-type label transfer. We provide
pre-trained models here for transferring cell-type labels to your own query dataset. Defining cell types is a tedious process, and reference data can significantly accelerate it.
By combining several label-transfer tools, popV produces a well-calibrated certainty score that helps detect cell types for which automatic annotation is highly
uncertain. We recommend manually checking transferred cell-type labels by plotting marker genes or differentially expressed genes before trusting them.
This is an open science initiative: please contribute your own models so that the single-cell community can leverage your reference datasets. To add your dataset, ask in our
[GitHub repository](https://github.com/YosefLab/popV).
|
---

## **Model Overview**
popV trains up to nine different algorithms for automatic label transfer and computes a consensus score from their predictions. It also generates an automatic report.
To learn how to apply popV to your own dataset, please refer to our [tutorial]().
|
### Algorithms

Currently implemented algorithms are:

- K-nearest neighbor classification after dataset integration with [BBKNN](https://github.com/Teichlab/bbknn)
- K-nearest neighbor classification after dataset integration with [Scanorama](https://github.com/brianhie/scanorama)
- K-nearest neighbor classification after dataset integration with [scVI](https://github.com/scverse/scvi-tools)
- K-nearest neighbor classification after dataset integration with [Harmony](https://github.com/lilab-bcb/harmony-pytorch)
- Random forest classification
- Support vector machine classification
- [OnClass](https://github.com/wangshenguiuc/OnClass) cell type classification
- [scANVI](https://github.com/scverse/scvi-tools) label transfer
- [CellTypist](https://www.celltypist.org) cell type classification
|
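As a rough illustration of how predictions from several methods can be combined into a consensus label and score, here is a minimal sketch that uses a simple majority vote. The function `consensus_vote`, the method keys, and the example labels are hypothetical, and popV's actual consensus scoring may differ.

```python
from collections import Counter


def consensus_vote(predictions):
    """Return (majority_label, n_agreeing_methods) for a single cell.

    `predictions` maps a method name to the label that method predicted.
    If several labels tie, the one encountered first wins.
    """
    counts = Counter(predictions.values())
    label, n_agree = counts.most_common(1)[0]
    return label, n_agree


# Hypothetical predictions for one cell from five methods (illustrative only).
cell_predictions = {
    "knn_on_scvi": "T cell",
    "knn_on_bbknn": "T cell",
    "random_forest": "T cell",
    "svm": "NK cell",
    "celltypist": "T cell",
}

label, score = consensus_vote(cell_predictions)
print(label, score)  # T cell 4

# A low number of agreeing methods marks a cell worth checking manually.
needs_review = score < 3
print(needs_review)  # False
```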
---

## **Key Applications**
The purpose of these models is to perform cell-type label transfer.
We provide models [with cuML support](collection) for large-scale reference mapping and [without cuML support](collection) for use when no GPU is available. Without a GPU, popV scales
well to 100k cells. popV has three prediction modes of differing complexity:
|
- **retrain** trains all classifiers from scratch. For 50k cells, this takes up to an hour of computing time using a GPU.
- **inference** uses pretrained classifiers to annotate both query and reference cells and constructs a joint embedding using all of the integration methods listed above. In our hands, this takes up to half an hour of computing time for 50k cells using a GPU.
- **fast** uses only methods with pretrained classifiers and annotates only query cells. For 50k cells, this takes 5 minutes without a GPU (without computing a UMAP embedding).
|
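The choice between the three modes can be sketched as a small decision helper. `choose_prediction_mode` and its two arguments are hypothetical and not part of popV; only the mode names come from the list above.

```python
def choose_prediction_mode(retrain_classifiers: bool, annotate_reference: bool) -> str:
    """Pick one of the three mode names described above.

    retrain_classifiers: True if the classifiers must be trained from scratch.
    annotate_reference:  True if reference cells should also be annotated and
                         a joint embedding of query and reference is wanted.
    """
    if retrain_classifiers:
        return "retrain"
    if annotate_reference:
        return "inference"
    # Cheapest option: pretrained classifiers, query cells only.
    return "fast"


# Example: pretrained models are available and only the query needs labels.
mode = choose_prediction_mode(retrain_classifiers=False, annotate_reference=False)
print(mode)  # fast
```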
---

## **Publications**
- **[Original popV paper](https://www.nature.com/articles/s41588-024-01993-3)**:
  - Published in *Nature Genetics*, this paper introduces popV and benchmarks it.
|
## **Contact**
- GitHub: [https://github.com/YosefLab/popV](https://github.com/YosefLab/popV)
- User questions: [Discourse](https://discourse.scverse.org)
|