ariG23498 committed 022ee87 (verified) · 1 parent: 537448c

Upload README.md with huggingface_hub (#1)


- Upload README.md with huggingface_hub (c212b9b05e11fe19435d73d3e0bd4fd5a716e0f6)

Files changed (1)
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
---
license: apache-2.0
tags:
- vision
---

# SigLIP 2 Large

[SigLIP 2](https://huggingface.co/collections/google/siglip2-67b5dcef38c175486e240107)
extends the pretraining objective of
[SigLIP](https://huggingface.co/collections/google/siglip-659d5e62f0ae1a57ae0e83ba)
with prior, independently developed techniques into a unified recipe for improved semantic
understanding, localization, and dense features.

## Intended uses

You can use the raw model for tasks like zero-shot image classification and
image-text retrieval, or as a vision encoder for VLMs (and other vision tasks).
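
Mechanically, zero-shot classification with a SigLIP-style model scores each candidate label independently: a sigmoid over scaled image-text similarities, rather than CLIP's softmax over labels. Below is a minimal NumPy sketch of that scoring step only; the embeddings, logit scale, and logit bias are made-up toy values, not outputs of the real model.

```python
# Illustrative sketch (not the model's API): SigLIP-style zero-shot scoring.
# All numbers here are synthetic stand-ins for real model outputs.
import numpy as np

def siglip_zero_shot(image_emb, text_embs, logit_scale=10.0, logit_bias=-10.0):
    """Score candidate labels for one image.

    image_emb: (d,) image embedding
    text_embs: (n, d) one embedding per candidate label
    Returns one probability per label from an independent sigmoid
    (SigLIP scores each image-text pair on its own, unlike CLIP's softmax).
    """
    # L2-normalize so the dot product becomes a cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = text_embs @ image_emb * logit_scale + logit_bias
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid, one score per label

# Toy embeddings: the image embedding is built close to the first label's.
rng = np.random.default_rng(0)
cat = rng.normal(size=64)
dog = rng.normal(size=64)
image = cat + 0.1 * rng.normal(size=64)

probs = siglip_zero_shot(image, np.stack([cat, dog]))
print(probs)  # the first label scores far above the second
```

Because each label gets its own sigmoid, the scores need not sum to one, which is what lets the same scoring rule handle an open-ended, user-supplied label set.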


## Training procedure

SigLIP 2 adds several training objectives on top of SigLIP:

1. Decoder loss
2. Global-local and masked prediction loss
3. Aspect ratio and resolution adaptability
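
To give a feel for the second objective, here is a rough NumPy sketch of a masked-prediction loss: mask some patch features and penalize the student's predictions at those positions against unmasked teacher features. This is only the general flavor of such an objective; the actual SigLIP 2 losses differ in detail, and every tensor below is a made-up stand-in.

```python
# Rough illustration of a masked-prediction objective (not SigLIP 2's exact
# loss): match student predictions to teacher patch features, but only at
# masked positions. All inputs are synthetic.
import numpy as np

def masked_prediction_loss(predicted, target, mask):
    """Mean squared error restricted to masked patch positions.

    predicted: (num_patches, d) student predictions
    target:    (num_patches, d) teacher features (treated as constants)
    mask:      (num_patches,) boolean, True where the patch was masked
    """
    diff = predicted[mask] - target[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
target = rng.normal(size=(16, 8))                     # "teacher" patch features
mask = np.arange(16) % 2 == 0                         # mask half the patches
predicted = target + 0.1 * rng.normal(size=(16, 8))   # near-perfect "student"

loss = masked_prediction_loss(predicted, target, mask)
print(loss)  # small, since predictions are close to the targets
```

Restricting the loss to masked positions is what forces the model to infer local content from context, which is the kind of pressure that improves dense, patch-level features.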

### Training data

SigLIP 2 is pre-trained on the WebLI dataset [(Chen et al., 2023)](https://arxiv.org/abs/2209.06794).

### Compute

The model was trained on up to 2048 TPU-v5e chips.

## Evaluation results

Evaluation of SigLIP 2 is shown below (taken from the paper).

[Evaluation Table](TODO)

### BibTeX entry and citation info

```bibtex
TODO
```