fix: added more details

README.md +52 -1

README.md CHANGED

	@@ -1 +1,52 @@
1	- [OpenAI CLIP model](https://openai.com/blog/clip/) fine-tuned using image-caption pairs from the [Caption Prediction dataset](https://www.imageclef.org/2017/caption) provided for the ImageCLEF 2017 competition. The model was evaluated using before and after fine-tuning, the MRR@10 were 0.57 and 0.88 respectively.

+---
+language:
+- en
+thumbnail:
+tags:
+- multimodal
+- language
+- vision
+- image-search
+license:
+- mit
+metrics:
+- MRR
+---
+### Model Card: clip-imageclef
+### Model Details
+[OpenAI CLIP model](https://openai.com/blog/clip/) fine-tuned using image-caption pairs from the [Caption Prediction dataset](https://www.imageclef.org/2017/caption) provided for the ImageCLEF 2017 competition. The model was evaluated before and after fine-tuning; MRR@10 was 0.57 and 0.88, respectively.
+### Model Date
+September 6, 2021
+### Model Type
+The base model is the OpenAI CLIP model. It uses a ViT-B/32 Transformer architecture as an image encoder and uses a masked self-attention Transformer as a text encoder. These encoders are trained to maximize the similarity of (image, text) pairs via a contrastive loss.
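To make the contrastive objective concrete, here is a minimal pure-Python sketch of a symmetric contrastive loss over a batch of (image, text) embedding pairs. It is an illustration only, not the training code used for this model: the function names are hypothetical, and the fixed `temperature=0.07` is an assumption (CLIP learns this scale during training).

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy(logits, target):
    """Negative log-softmax of the target entry, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair i is (image_embs[i], text_embs[i]).

    temperature is fixed here as an assumption; CLIP learns it.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)
    # sim[i][j] = cosine(image i, text j) / temperature
    sim = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
           for img in images]
    # Each image should pick out its own text (rows) and vice versa (columns).
    loss_i2t = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    loss_t2i = sum(cross_entropy([sim[i][j] for i in range(n)], j)
                   for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

aligned = clip_contrastive_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = clip_contrastive_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
```

The loss is near zero when each image is most similar to its own caption (`aligned`) and large when the pairs are mismatched (`shuffled`).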
+### Fine-tuning
+The fine-tuning can be reproduced using code from the GitHub repository [elsevierlabs-os/clip-image-search](https://github.com/elsevierlabs-os/clip-image-search#fine-tuning).
+### Usage
+```
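+from transformers import CLIPModel, CLIPProcessor
+# Fine-tuned weights; the processor is loaded from the base CLIP checkpoint.
+model = CLIPModel.from_pretrained("sujitpal/clip-imageclef")
+processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
+# captions: list of strings; images: list of PIL images (supplied by the caller)
+inputs = processor(text=captions, images=images,
+                   return_tensors="pt", padding=True)
+output = model(**inputs)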
+```
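For image search, the model output can be turned into a ranking. The sketch below assumes the caption-to-image scores have been pulled out of `output.logits_per_text` (one row per caption, one column per image); `top_k_images` is a hypothetical helper, and a small hand-written matrix stands in for real model scores so the example runs without downloading weights.

```python
def top_k_images(caption_scores, k=10):
    """Return indices of the k highest-scoring images for one caption."""
    order = sorted(range(len(caption_scores)),
                   key=lambda i: caption_scores[i], reverse=True)
    return order[:k]

# With the usage snippet above, the score matrix would come from the model:
#   scores = output.logits_per_text.detach().tolist()  # [num_captions][num_images]
# Illustrative stand-in values (not real model output):
scores = [
    [0.1, 2.3, 0.7],
    [1.9, 0.2, 0.4],
]
rankings = [top_k_images(row, k=2) for row in scores]
```

Each entry of `rankings` lists image indices for one caption, best match first.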
+### Performance
+| Model-name                       | MRR@1 | MRR@3 | MRR@5 | MRR@10 | MRR@20 |
+| -------------------------------- | ----- | ----- | ----- | ------ | ------ |
+| zero-shot CLIP (baseline)        | 0.426 | 0.534 | 0.558 | 0.573 | 0.578 |
+| clip-imageclef (this model)      | 0.802 | 0.872 | 0.877 | 0.879 | 0.880 |
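The figures in the table are MRR@k values. As a rough sketch of how such a metric can be computed, the function below averages the reciprocal rank of the single relevant item within the top k results of each query. The one-relevant-item-per-query setup and the name `mrr_at_k` are assumptions for illustration; the exact evaluation protocol is in the fine-tuning repository, not in this card.

```python
def mrr_at_k(ranked_lists, relevant_ids, k=10):
    """Mean reciprocal rank, counting a hit only if it appears in the top k.

    ranked_lists[i] is the ranked result list for query i;
    relevant_ids[i] is the one item considered correct for query i (assumption).
    """
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_ids):
        for rank, item in enumerate(ranked[:k], start=1):
            if item == relevant:
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(ranked_lists)

# Toy example: the relevant item "a" appears at ranks 1, 2 and 3.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
relevant = ["a", "a", "a"]
mrr_1 = mrr_at_k(ranked, relevant, k=1)
mrr_3 = mrr_at_k(ranked, relevant, k=3)
```

Because a larger k can only add hits, MRR@k never decreases as k grows, which matches the left-to-right trend in each table row.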