This model is a fine-tuned version of hustvl/yolos-tiny.

You can find details of the model in the GitHub repo -> fashion-visual-search

You can also find the fashion image feature extractor model -> yainage90/fashion-image-feature-extractor

This model was trained on a combination of two datasets: ModaNet and Fashionpedia.

The labels are ['bag', 'bottom', 'dress', 'hat', 'shoes', 'outer', 'top'].
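If you need the index-to-label mapping programmatically, it is stored in the checkpoint's config. A minimal sketch (read the ordering from model.config.id2label rather than assuming it from the list above):

from transformers import YolosForObjectDetection

# Print the label mapping baked into the checkpoint, e.g. {0: 'bag', 1: 'bottom', ...}
# (the exact index order comes from the config, not from the list above).
model = YolosForObjectDetection.from_pretrained('yainage90/fashion-object-detection-yolos-tiny')
print(model.config.id2label)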

The best checkpoint was reached at epoch 96 of 100, with an mAP of 0.697400.

from PIL import Image
import torch
from transformers import YolosImageProcessor, YolosForObjectDetection

# Pick the best available device: CUDA, then Apple MPS, then CPU.
device = 'cpu'
if torch.cuda.is_available():
    device = torch.device('cuda')
elif torch.backends.mps.is_available():
    device = torch.device('mps')

ckpt = 'yainage90/fashion-object-detection-yolos-tiny'
image_processor = YolosImageProcessor.from_pretrained(ckpt)
model = YolosForObjectDetection.from_pretrained(ckpt).to(device)

image = Image.open('<path/to/image>').convert('RGB')

with torch.no_grad():
    inputs = image_processor(images=[image], return_tensors="pt")
    outputs = model(**inputs.to(device))
    # Post-process into absolute pixel boxes; target_sizes expects (height, width).
    target_sizes = torch.tensor([[image.size[1], image.size[0]]])
    results = image_processor.post_process_object_detection(outputs, threshold=0.85, target_sizes=target_sizes)[0]

    items = []
    for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
        score = score.item()
        label = label.item()
        box = [i.item() for i in box]  # [xmin, ymin, xmax, ymax]
        print(f"{model.config.id2label[label]}: {round(score, 3)} at {box}")
        items.append((score, label, box))
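
To sanity-check the detections visually, you can draw the predicted boxes back onto the input image with PIL. This is a minimal sketch that reuses the image, model, and items variables from the snippet above; the output filename is just an example:

from PIL import ImageDraw

annotated = image.copy()
draw = ImageDraw.Draw(annotated)
for score, label, box in items:
    # box is [xmin, ymin, xmax, ymax] in pixel coordinates.
    draw.rectangle(box, outline='red', width=3)
    draw.text((box[0], box[1]), f"{model.config.id2label[label]} {score:.2f}", fill='red')
annotated.save('detections.png')  # example output path

The same box coordinates can also be passed to image.crop() to cut out each detected item, for example before embedding it with the feature extractor model linked above.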

