Model Card for mobilenet_v4_l_eu-common

A MobileNet v4 image classification model trained on the eu-common dataset, which contains common European bird species.

The species list is derived from the Collins bird guide [^1].

[^1]: Svensson, L., Mullarney, K., & Zetterström, D. (2022). Collins bird guide (3rd ed.). London, England: William Collins.

Model Details

  • Model Type: Image classification and detection backbone

  • Model Stats:

    • Params (M): 32.2
    • Input image size: 384 x 384
  • Dataset: eu-common (707 classes)

  • Papers:

    • MobileNetV4 -- Universal Models for the Mobile Ecosystem: https://arxiv.org/abs/2404.10518

Model Usage

Image Classification

import birder
from birder.inference.classification import infer_image

(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)

# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)

# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)

image = "path/to/image.jpeg"  # or a PIL image, must be loaded in RGB format
(out, _) = infer_image(net, image, transform)
# out is a NumPy array with shape (1, 707), holding one probability per class.
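
To turn the probability array into a ranked shortlist, one option is to sort it with NumPy. The following is a minimal sketch, not part of the birder API: `top_k` is a hypothetical helper, and the random array is only a stand-in for the `out` returned by `infer_image`. Mapping an index to a species name needs the model's class list, which is not shown here.

```python
import numpy as np


def top_k(probs: np.ndarray, k: int = 5) -> list[tuple[int, float]]:
    """Return (class_index, probability) pairs for the k highest-scoring classes."""
    row = probs[0]  # drop the batch dimension: (1, num_classes) -> (num_classes,)
    order = np.argsort(row)[::-1][:k]  # indices sorted by descending probability
    return [(int(i), float(row[i])) for i in order]


# Stand-in for `out` (assumption): random scores normalized so each row sums to 1
rng = np.random.default_rng(0)
scores = rng.random((1, 707))
out = scores / scores.sum(axis=1, keepdims=True)

predictions = top_k(out, k=3)
for (idx, prob) in predictions:
    print(f"class {idx}: {prob:.4f}")
```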

Image Embeddings

import birder
from birder.inference.classification import infer_image

(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)

# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)

# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)

image = "path/to/image.jpeg"  # or a PIL image, must be loaded in RGB format
(out, embedding) = infer_image(net, image, transform, return_embedding=True)
# embedding is a NumPy array with shape (1, 1280)
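
A common use of such embeddings is comparing two images by cosine similarity. The sketch below assumes two embeddings of shape (1, 1280) have already been computed as above; `cosine_similarity` is a hypothetical helper, and the random arrays are stand-ins for real embeddings.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embeddings, ignoring the batch dimension."""
    a = a.ravel()
    b = b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Stand-ins for two embeddings returned by infer_image (assumption)
rng = np.random.default_rng(0)
emb_a = rng.standard_normal((1, 1280))
emb_b = rng.standard_normal((1, 1280))

sim_self = cosine_similarity(emb_a, emb_a)  # an embedding compared with itself
sim_other = cosine_similarity(emb_a, emb_b)
print(f"self: {sim_self:.3f}, other: {sim_other:.3f}")
```

Values close to 1 indicate embeddings pointing in nearly the same direction. How well that tracks visual or species similarity for this model is not established here.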

Detection Feature Map

from PIL import Image
import birder

(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)

# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)

# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)

image = Image.open("path/to/image.jpeg")
features = net.detection_features(transform(image).unsqueeze(0))
# features is a dict (stage name -> torch.Tensor)
print([(k, v.size()) for k, v in features.items()])
# Output example:
# [('stage1', torch.Size([1, 48, 96, 96])),
#  ('stage2', torch.Size([1, 96, 48, 48])),
#  ('stage3', torch.Size([1, 192, 24, 24])),
#  ('stage4', torch.Size([1, 512, 12, 12]))]
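
The spatial sizes in the example output imply a downsampling factor (stride) for each stage relative to the 384 x 384 input. The short sketch below derives those strides from the shapes listed above. The shapes are copied by hand from the example output rather than computed from the model.

```python
input_size = 384  # input resolution listed under Model Stats

# Shapes copied from the example output above: (batch, channels, height, width)
feature_shapes = {
    "stage1": (1, 48, 96, 96),
    "stage2": (1, 96, 48, 48),
    "stage3": (1, 192, 24, 24),
    "stage4": (1, 512, 12, 12),
}

# Stride = input resolution divided by the feature map's spatial resolution
strides = {name: input_size // shape[-1] for (name, shape) in feature_shapes.items()}
print(strides)
# {'stage1': 4, 'stage2': 8, 'stage3': 16, 'stage4': 32}
```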

Citation

@misc{qin2024mobilenetv4universalmodels,
      title={MobileNetV4 -- Universal Models for the Mobile Ecosystem},
      author={Danfeng Qin and Chas Leichner and Manolis Delakis and Marco Fornoni and Shixin Luo and Fan Yang and Weijun Wang and Colby Banbury and Chengxi Ye and Berkin Akin and Vaibhav Aggarwal and Tenghui Zhu and Daniele Moro and Andrew Howard},
      year={2024},
      eprint={2404.10518},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2404.10518},
}