|
--- |
|
tags: |
|
- image-feature-extraction |
|
- timm |
|
- pathology |
|
- histology |
|
- medical imaging |
|
- self-supervised learning |
|
- vision transformer |
|
- foundation model |
|
library_name: timm |
|
license: cc-by-nc-nd-4.0 |
|
extra_gated_prompt: >- |
|
- This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. |
|
|
|
- Any commercial use, sale, or other monetization of the H0-mini model and its derivatives, which include models trained on outputs from the H0-mini model or datasets created from the H0-mini model, is prohibited and requires prior approval. |
|
|
|
- Please note that the primary email used to sign up for your Hugging Face account must match your institutional email to receive approval. By downloading the model, you attest that all information (affiliation, research use) is correct and up-to-date. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the H0-mini model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. |
|
|
|
- This model is provided “as-is” without warranties of any kind, express or implied. This model has not been reviewed, certified, or approved by any regulatory body, including but not limited to the FDA (U.S.), EMA (Europe), MHRA (UK), or other medical device authorities. Any application of this model in healthcare or biomedical settings must comply with relevant regulatory requirements and undergo independent validation. Users assume full responsibility for how they use this model and any resulting consequences. The authors, contributors, and distributors disclaim any liability for damages, direct or indirect, resulting from model use. Users are responsible for ensuring compliance with data protection regulations (e.g., GDPR, HIPAA) when using it in research that involves patient data. |
|
|
|
- If you are a commercial entity, please contact us at hello [at] bioptimus.com to discuss licensing options. |
|
extra_gated_fields: |
|
Full name (first and last): text |
|
Current affiliation (no abbreviations): text |
|
Type of Affiliation: |
|
type: select |
|
options: |
|
- Academia |
|
- Industry |
|
- label: Other |
|
value: other |
|
Current and official institutional email (**this must match your primary email in your Hugging Face account, @gmail/@hotmail/@qq email domains will be denied**): text |
|
Main use-case: |
|
type: select |
|
options: |
|
- Models benchmarking on various tasks |
|
- Biomarker Discovery |
|
- Diagnostics |
|
- Pathology workflows acceleration (cell & tissue segmentation etc) |
|
- label: Other |
|
value: other |
|
Please add information on your intended research use: text |
|
I agree to all terms outlined above: checkbox |
|
  I agree not to distribute the model; if another user within my organization wishes to use the H0-mini model, they must register as an individual user: checkbox
|
I agree to use this model for non-commercial, academic purposes only: checkbox |
|
  I am interested in receiving updates from Bioptimus:
|
type: checkbox |
|
optional: true |
|
--- |
|
|
|
# Model card for H0-mini |
|
|
|
<p align="center"> |
|
<img src="./logo.png" width="400" height="144" /> |
|
</p> |
|
|
|
`H0-mini` is a lightweight foundation model for histology developed by [Owkin](https://www.owkin.com/) and [Bioptimus](https://www.bioptimus.com/). |
|
|
|
The model is a Vision Transformer Base/14 (ViT-B/14) distilled from `H-optimus-0` [1] (a ViT-g/14) using the DINOv2 [2] self-supervised distillation method on `PanCancer40M`, a set of 43 million histology tiles extracted from 6,093 histology slides from [TCGA](https://portal.gdc.cancer.gov/).
|
|
|
`H0-mini` achieves performance comparable to that of current histology foundation models at a significantly reduced inference cost. It also demonstrates strong robustness to variations in staining and scanning protocols. Please refer to the [arXiv preprint](https://arxiv.org/abs/2501.16239) for additional details.
|
|
|
 |
|
Figure: Assessment of model robustness to staining and scanning conditions in PLISM dataset [3] - Median top-10 accuracy vs. mean cosine similarity was computed for each extractor over 4,095 slide pairs. For both axes, higher values indicate more robust models. |
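
For clarity, the cosine-similarity axis can be read as follows: features are extracted from registered acquisitions of the same tissue under two staining/scanning conditions, and the cosine similarity between paired features is averaged. A minimal sketch of that computation (the feature tensors below are random placeholders, not PLISM data):

```python
import torch
import torch.nn.functional as F

# Placeholder features for N registered tile pairs acquired under two
# staining/scanning conditions (e.g. extracted with H0-mini).
features_a = torch.randn(1000, 768)
features_b = torch.randn(1000, 768)

# Cosine similarity per registered pair, averaged over all pairs: higher
# values indicate features that are more invariant to acquisition changes.
mean_cosine_similarity = F.cosine_similarity(features_a, features_b, dim=-1).mean()
print(mean_cosine_similarity.item())
```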
|
|
|
|
|
### How to use it to extract features. |
|
|
|
`H0-mini` can be used with or without fine-tuning for different downstream applications, such as slide-level classification with multiple-instance learning algorithms (e.g., ABMIL [4]); a minimal sketch of such an aggregator follows.
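
The sketch below shows a gated attention-based MIL head in the spirit of ABMIL [4]. It is illustrative only, not the exact architecture from [4] or from our experiments: the dimensions, gating design, and classification head are assumptions, and the input is a bag of tile-level features such as the CLS features extracted in the next snippet.

```python
import torch
import torch.nn as nn


class GatedAttentionMIL(nn.Module):
    """Illustrative gated attention-based MIL head in the spirit of ABMIL [4]."""

    def __init__(self, in_dim: int = 768, hidden_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.attention_v = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.attention_u = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
        self.attention_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(in_dim, n_classes)

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        # bag: (n_tiles, in_dim) tile-level features of a single slide.
        scores = self.attention_w(self.attention_v(bag) * self.attention_u(bag))
        weights = torch.softmax(scores, dim=0)  # (n_tiles, 1)
        slide_embedding = (weights * bag).sum(dim=0)  # (in_dim,)
        return self.classifier(slide_embedding)  # (n_classes,)


# Hypothetical bag of 1,000 tile-level features (e.g. H0-mini CLS tokens).
mil_head = GatedAttentionMIL()
logits = mil_head(torch.randn(1000, 768))  # (2,)
```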
|
|
|
The following code snippet allows you to extract features from histology images using `H0-mini`.

We recommend using the CLS token (`cls_features`) as input features for downstream tasks. Concatenating the CLS token features with the average of the patch token features (`concatenated_features`) may bring improvements on some tasks.
|
|
|
```python |
|
from huggingface_hub import login |
|
import torch |
|
import timm |
|
from timm.data import resolve_data_config |
|
from timm.data.transforms_factory import create_transform |
|
from torchvision import transforms |
|
|
|
|
|
# Login to the Hugging Face hub, using your user access token that can be found here: |
|
# https://huggingface.co/settings/tokens. |
|
login() |
|
|
|
model = timm.create_model( |
|
"hf-hub:bioptimus/H0-mini", |
|
pretrained=True, |
|
mlp_layer=timm.layers.SwiGLUPacked, |
|
act_layer=torch.nn.SiLU, |
|
) |
|
model.to("cuda") |
|
model.eval() |
|
|
|
transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model)) |
|
|
|
# Example input: swap this random tensor for a real histology tile.
image = torch.rand(3, 224, 224)
image = transforms.ToPILImage()(image)

# We recommend using mixed precision for faster inference.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    with torch.inference_mode():
        output = model(transform(image).unsqueeze(0).to("cuda"))  # (1, 261, 768)
|
# CLS token features (1, 768): |
|
cls_features = output[:, 0] |
|
# Patch token features (1, 256, 768): |
|
patch_token_features = output[:, model.num_prefix_tokens :] |
|
# Concatenate the CLS token features with the mean of the patch token |
|
# features (1, 1536): |
|
concatenated_features = torch.cat( |
|
[cls_features, patch_token_features.mean(1)], dim=-1 |
|
) |
|
|
|
assert cls_features.shape == (1, 768) |
|
assert patch_token_features.shape == (1, 256, 768) |
|
assert concatenated_features.shape == (1, 1536) |
|
``` |
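
In practice, features are usually extracted for many tiles at once, and batching them amortizes the inference cost. A minimal sketch, reusing the `model` and `transform` objects created above (the random `tiles` list is a hypothetical stand-in for real histology tiles):

```python
import torch
from torchvision import transforms

# Hypothetical stand-in for a list of 224x224 histology tiles (PIL images).
tiles = [transforms.ToPILImage()(torch.rand(3, 224, 224)) for _ in range(8)]

# Stack the transformed tiles into a single batch of shape (8, 3, 224, 224).
batch = torch.stack([transform(tile) for tile in tiles]).to("cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    with torch.inference_mode():
        batch_cls_features = model(batch)[:, 0]  # (8, 768)
```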
|
|
|
These features can then be used for downstream applications such as ROI classification (via linear or k-NN probing), |
|
slide classification (via multiple instance learning), segmentation (via ViT-Adapter for instance), etc. |
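
As an example of probing, frozen `H0-mini` features can be fed to a simple classifier. The sketch below assumes scikit-learn is available and uses random placeholder arrays in place of real features and labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Placeholder frozen features (e.g. H0-mini CLS tokens) and tile labels.
train_features = np.random.randn(500, 768).astype(np.float32)
train_labels = np.random.randint(0, 2, size=500)
test_features = np.random.randn(100, 768).astype(np.float32)

# Linear probing: a logistic regression trained on frozen features.
linear_probe = LogisticRegression(max_iter=1000).fit(train_features, train_labels)
linear_predictions = linear_probe.predict(test_features)

# k-NN probing: a non-parametric alternative on the same features.
knn_probe = KNeighborsClassifier(n_neighbors=20).fit(train_features, train_labels)
knn_predictions = knn_probe.predict(test_features)
```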
|
|
|
## Software Dependencies. |
|
|
|
- torch>=2.0.0: https://pytorch.org
|
- torchvision>=0.15.0: https://pytorch.org/vision/stable/index.html |
|
- xformers>=0.0.18: https://github.com/facebookresearch/xformers |
|
|
|
|
|
## Citation. |
|
|
|
If you are using this model, please cite our work: |
|
|
|
``` |
|
@misc{filiot2025distillingfoundationmodelsrobust, |
|
title={Distilling foundation models for robust and efficient models in digital pathology}, |
|
author={Alexandre Filiot and Nicolas Dop and Oussama Tchita and Auriane Riou and Thomas Peeters and Daria Valter and Marin Scalbert and Charlie Saillard and Geneviève Robin and Antoine Olivier}, |
|
year={2025}, |
|
eprint={2501.16239}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CV}, |
|
url={https://arxiv.org/abs/2501.16239}, |
|
} |
|
``` |
|
## Acknowledgements. |
|
|
|
### _Computing resources_. |
|
|
|
This work was granted access to the High-Performance Computing (HPC) resources of IDRIS under the allocations 2023-A0141012519, 2024-A0161012519 and 2024-GC011015442 made by GENCI. |
|
|
|
### _Code_. |
|
|
|
`H0-mini` was built upon [DINOv2](https://github.com/facebookresearch/dinov2) repository (Apache License 2.0). |
|
|
|
### _Datasets_. |
|
|
|
The results published here are partly based upon data generated by the TCGA Research Network: [https://www.cancer.gov/tcga](https://www.cancer.gov/tcga). |
|
|
|
|
|
## References |
|
|
|
1. Saillard, C., Jenatton, R., Llinares-López, F., Mariet, Z., Cahané, D., Durand, E., & Vert, J.-P. (2024). H-optimus-0.

2. Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., ... & Bojanowski, P. (2023). DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193.
|
3. Ochi, M., Komura, D., Onoyama, T., Shinbo, K., Endo, H., Odaka, H., ... & Ishikawa, S. (2024). Registered multi-device/staining histology image dataset for domain-agnostic machine learning models. Scientific Data, 11(1), 330. |
|
4. Ilse, M., Tomczak, J., & Welling, M. (2018, July). Attention-based deep multiple instance learning. In International conference on machine learning (pp. 2127-2136). PMLR. |
|
|