---
license: apache-2.0
tags:
- vision
---
# SigLIP 2 Base

[SigLIP 2](https://huggingface.co/collections/google/siglip2-67b5dcef38c175486e240107)
extends the pretraining objective of
[SigLIP](https://huggingface.co/collections/google/siglip-659d5e62f0ae1a57ae0e83ba)
with prior, independently developed techniques into a unified recipe, for improved semantic
understanding, localization, and dense features.

## Intended uses

You can use the raw model for tasks like zero-shot image classification and
image-text retrieval, or as a vision encoder for VLMs (and other vision tasks).
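
For example, here is a minimal zero-shot classification sketch using the Transformers `pipeline` API. The checkpoint id, image URL, and candidate labels are illustrative; substitute the exact repo id of this model.

```python
from transformers import pipeline

# Zero-shot image classification with a SigLIP 2 checkpoint.
# The checkpoint id below is an assumption; use the exact repo id of this model.
classifier = pipeline(
    task="zero-shot-image-classification",
    model="google/siglip2-base-patch16-224",
)

image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
candidate_labels = ["2 cats", "a plane", "a remote"]

for result in classifier(image_url, candidate_labels=candidate_labels):
    print(f"{result['label']}: {result['score']:.4f}")
```

To use the model as a vision encoder, you can extract pooled image embeddings directly. Again a sketch under the same checkpoint assumption; `get_image_features` is the method exposed by the SigLIP model classes in Transformers.

```python
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

# Assumed checkpoint id; replace with this model's repo id.
ckpt = "google/siglip2-base-patch16-224"
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

image = Image.open(requests.get(
    "http://images.cocodataset.org/val2017/000000039769.jpg", stream=True
).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    # Pooled image embedding of shape (batch_size, hidden_dim).
    image_embeds = model.get_image_features(**inputs)
print(image_embeds.shape)
```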
## Training procedure

SigLIP 2 adds some clever training objectives on top of SigLIP (a sketch of the base sigmoid loss they extend follows the list):

1. Decoder loss
2. Global-local and masked prediction loss
3. Aspect ratio and resolution adaptability
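
These objectives are added on top of SigLIP's pairwise sigmoid loss. For reference, here is a minimal PyTorch sketch of that base loss; it is an illustration written for this card, not the actual training code, and the function name and arguments are placeholders.

```python
import torch
import torch.nn.functional as F

def sigmoid_contrastive_loss(image_embeds, text_embeds, log_temperature, bias):
    """SigLIP-style pairwise sigmoid loss over a batch of L2-normalized embeddings."""
    # Similarity logits for every image-text pair in the batch.
    logits = image_embeds @ text_embeds.t() * log_temperature.exp() + bias
    # Matched pairs sit on the diagonal (+1); all other pairs are negatives (-1).
    labels = 2.0 * torch.eye(logits.size(0), device=logits.device) - 1.0
    # Binary sigmoid loss on every pair, averaged over the batch dimension.
    return -F.logsigmoid(labels * logits).sum() / logits.size(0)
```

The three objectives listed above sit on top of this base loss; see the paper for their exact formulation.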
### Training data

SigLIP 2 is pre-trained on the WebLI dataset [(Chen et al., 2023)](https://arxiv.org/abs/2209.06794).

### Compute

The model was trained on up to 2048 TPU-v5e chips.

## Evaluation results

Evaluation of SigLIP 2 is shown below (taken from the paper).

[Evaluation Table](TODO)
### BibTeX entry and citation info

```bibtex
TODO
```