Image Classification Model (ViT)

This is an image classification model based on the Vision Transformer (ViT), fine-tuned on the MNIST dataset. It classifies each input image as one of 10 classes (digits 0-9). The code is compatible with Hugging Face's inference providers for deployment.

Model Details

  • Model Type: Vision Transformer (ViT)
  • Base Model: google/vit-base-patch16-224
  • Task: Image Classification
  • Dataset: MNIST (handwritten digits)
  • Labels: 10 classes (0-9)
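The base model name encodes a 224x224 input resolution and a 16x16 patch size, while MNIST digits are 28x28 grayscale images, so inputs have to be resized and expanded to three channels before the model sees them. The sketch below illustrates that shape change in plain Python. It is not the preprocessing used by this repository (the card does not describe it, and an image processor would normally also interpolate and normalize); the function names `upscale_to_vit` and `num_patches` are hypothetical.

```python
# Illustrative only: the model card does not describe its preprocessing.
MNIST_SIZE = 28   # MNIST digits are 28x28 grayscale
VIT_SIZE = 224    # input resolution encoded in "vit-base-patch16-224"
PATCH_SIZE = 16   # patch size encoded in the same name


def upscale_to_vit(gray):
    """Nearest-neighbour upscale of a 28x28 grid to 224x224 with 3 identical channels."""
    factor = VIT_SIZE // MNIST_SIZE  # 224 / 28 = 8, an exact integer factor
    rows = []
    for y in range(VIT_SIZE):
        src_row = gray[y // factor]
        # Repeat the single grayscale value across three channels.
        rows.append([(src_row[x // factor],) * 3 for x in range(VIT_SIZE)])
    return rows


def num_patches(size=VIT_SIZE, patch=PATCH_SIZE):
    """Number of non-overlapping patches the transformer sees per image."""
    return (size // patch) ** 2
```

With these values each image is split into a 14x14 grid of patches, so every original MNIST pixel covers an 8x8 block and each patch spans a 2x2 group of original pixels.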

How to Use

Install Requirements

Install the dependencies listed in requirements.txt:

pip3 install -r requirements.txt

Run Unit Tests

python3 -m unittest discover -s tests
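Run Inference (sketch)

As a rough illustration of classifying a single image, the sketch below uses the `transformers` image-classification pipeline. It assumes the repository `SupremoUGH/image-classification-model` hosts weights and a preprocessor config that `transformers` can load directly, and that `transformers`, `torch`, and `Pillow` are installed; neither is confirmed by this card. The helper names `top_prediction` and `classify` and the file name `digit.png` are hypothetical.

```python
MODEL_ID = "SupremoUGH/image-classification-model"  # repo id as shown on the model page


def top_prediction(predictions):
    """Return the (label, score) pair with the highest score.

    `predictions` is a list of {"label": str, "score": float} dicts,
    the shape returned by a transformers image-classification pipeline.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify(image_path, model_id=MODEL_ID):
    """Classify one image file (assumes the repo loads via transformers)."""
    # Imported lazily so top_prediction can be used without heavy dependencies.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image_path))
```

Calling `classify("digit.png")` would then return the most likely label and its score. The exact label strings depend on the model's configuration, which this card does not list.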