inference-net/ClipTagger-12b
Tags: Image-Text-to-Text, Safetensors, English, gemma3, VLM, video-understanding, image-captioning, gemma, json-mode, structured-output, video-analysis, conversational, Eval Results, compressed-tensors
License: apache-2.0
File size: 70 Bytes
6733c82
{
  "image_seq_length": 256,
  "processor_class": "Gemma3Processor"
}
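
The config above can be read with any JSON parser. The following is a minimal sketch using only the Python standard library. The helper name load_processor_config and the validation rules are assumptions made for illustration and are not part of this repository. In the transformers library, image_seq_length is, as far as I know, the number of placeholder tokens each image expands to in the prompt, but the sketch only parses and checks the two keys shown in the file.

```python
import json

# Contents of the config file, copied from the listing above.
CONFIG_TEXT = """
{
  "image_seq_length": 256,
  "processor_class": "Gemma3Processor"
}
"""


def load_processor_config(text: str) -> dict:
    """Parse the config text and check the two keys shown in the file.

    Hypothetical helper for illustration; not an API of any library.
    """
    config = json.loads(text)
    for key in ("image_seq_length", "processor_class"):
        if key not in config:
            raise KeyError(f"missing key: {key}")
    if not isinstance(config["image_seq_length"], int):
        raise TypeError("image_seq_length must be an integer")
    return config


config = load_processor_config(CONFIG_TEXT)
print(config["processor_class"], config["image_seq_length"])
```

In practice the processor would normally be loaded through the library's own loading utilities rather than by parsing the file by hand; the sketch is only meant to make the file's two fields concrete.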