zer0int's picture
OpenAI's CLIP-L@336 Text Encoder
c109d26 verified
metadata
license: mit
base_model:
  - openai/clip-vit-large-patch14-336

This is NOT a fine-tune.

This is the original OpenAI CLIP ViT-L/14@336 Text Encoder, converted to HuggingFace 'transformers' format. All credits to the original authors.

Why?

  • It's a normal "CLIP-L" Text Encoder and can be used as such.
  • See below (Flux.1-dev, CLIP only guidance, CFG 3.5, Heun).
  • For my fine-tuned KO-CLIP ViT-L/14@336 -> see here

image/jpeg

image/jpeg