Model Card for Thesis-CLIP-geoloc-continent

A CLIP-ViT model fine-tuned for image geolocation and optimized for country-level queries.

Model Details

Model Description

Model Sources

How to Get Started with the Model

from PIL import Image
import requests
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("jrheiner/thesis-clip-geoloc-continent")
processor = CLIPProcessor.from_pretrained("jrheiner/thesis-clip-geoloc-continent")

url = "https://huggingface.co/spaces/jrheiner/thesis-demo/resolve/main/kerger-test-images/Oceania_Australia_-32.947127313081_151.47903359833_kerger.jpg"
image = Image.open(requests.get(url, stream=True).raw)
choices = ["Botswana", "Eswatini", "Ghana", "Kenya", "Lesotho", "Nigeria", "Senegal", "South Africa", "Rwanda", "Uganda", "Tanzania", "Madagascar", "Djibouti", "Mali", "Libya", "Morocco", "Somalia", "Tunisia", "Egypt", "Réunion", "Bangladesh", "Bhutan", "Cambodia", "China", "India", "Indonesia", "Israel", "Japan", "Jordan", "Kyrgyzstan", "Laos", "Malaysia", "Mongolia", "Nepal", "Palestine", "Philippines", "Singapore", "South Korea", "Sri Lanka", "Taiwan", "Thailand", "United Arab Emirates", "Vietnam", "Afghanistan", "Azerbaijan", "Cyprus", "Iran", "Syria", "Tajikistan", "Turkey", "Russia", "Pakistan", "Hong Kong", "Albania", "Andorra", "Austria", "Belgium", "Bulgaria", "Croatia", "Czechia", "Denmark", "Estonia", "Finland", "France", "Germany", "Greece", "Hungary", "Iceland", "Ireland", "Italy", "Latvia", "Lithuania", "Luxembourg", "Montenegro", "Netherlands", "North Macedonia", "Norway", "Poland", "Portugal", "Romania", "Russia", "Serbia", "Slovakia", "Slovenia", "Spain", "Sweden", "Switzerland", "Ukraine", "United Kingdom", "Bosnia and Herzegovina", "Cyprus", "Turkey", "Greenland", "Faroe Islands", "Canada", "Dominican Republic", "Guatemala", "Mexico", "United States", "Bahamas", "Cuba", "Panama", "Puerto Rico", "Bermuda", "Greenland", "Australia", "New Zealand", "Fiji", "Papua New Guinea", "Solomon Islands", "Vanuatu", "Argentina", "Bolivia", "Brazil", "Chile", "Colombia", "Ecuador", "Paraguay", "Peru", "Uruguay"]
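# Some labels (e.g. "Russia", "Turkey") appear twice in the list above; drop the repeats so their probability is not split.
choices = list(dict.fromkeys(choices))
inputs = processor(text=choices, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores, shape (1, len(choices))
probs = logits_per_image.softmax(dim=1)  # softmax over the labels gives per-label probabilities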
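
The snippet above ends with one probability per candidate label. A minimal sketch of turning that into a ranked list of guesses follows. The helper name `top_k_labels` and the demo values are hypothetical and not part of the model or the transformers library. It works on plain Python lists, so with the code above you would pass `choices` and `probs[0].detach().tolist()`.

```python
from typing import List, Sequence, Tuple


def top_k_labels(
    labels: Sequence[str], probabilities: Sequence[float], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k (label, probability) pairs with the highest probability."""
    if len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must have the same length")
    # Sort label/probability pairs by probability, highest first.
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up values for illustration only (not real model output).
demo_labels = ["Australia", "New Zealand", "Fiji", "Chile"]
demo_probs = [0.70, 0.20, 0.04, 0.06]

for label, p in top_k_labels(demo_labels, demo_probs, k=3):
    print(f"{label}: {p:.2f}")
```

Ranking several candidates rather than taking only the argmax makes it easier to see when the model is torn between visually similar countries.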

Training Details

The model was fine-tuned on 177,270 images (29,545 for each of six continents) sourced from Mapillary.

Model size: 428M parameters (Safetensors, tensor type F32)