Checkbox State Classifier - EfficientNet-B0

A fine-tuned EfficientNet-B0 model for binary classification of checkbox states (checked/unchecked). It reaches ~95% accuracy on the validation set.

Model Description

This model is fine-tuned from google/efficientnet-b0 on the wendys-llc/chkbx dataset. It's designed to classify UI checkboxes in screenshots and interface images.

Key Features

No trust_remote_code required - Uses native transformers support
Fast inference - EfficientNet-B0 is the smallest variant in the EfficientNet family
High accuracy - ~95% on validation set
Simple API - Works with transformers pipeline out of the box

Usage

Quick Start with Pipeline (Recommended)

from transformers import pipeline
from PIL import Image

# Load the model
classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")

# Classify an image
image = Image.open("checkbox.jpg")
results = classifier(image)

# Print results
for result in results:
    print(f"{result['label']}: {result['score']:.2%}")

# Get just the top prediction
top_result = classifier(image, top_k=1)[0]
print(f"Checkbox is: {top_result['label']} (confidence: {top_result['score']:.2%})")
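Downstream code often wants a boolean rather than a label string. The sketch below assumes the pipeline returns a list of label/score dicts, as in the example above, and that the label strings are exactly `checked` and `unchecked` (as listed under Model Details; verify against `model.config.id2label`). The helper name `is_checked` and the 0.8 threshold are illustrative choices, not part of the model's API.

```python
from typing import Dict, List, Optional

def is_checked(results: List[Dict], threshold: float = 0.8) -> Optional[bool]:
    """Map pipeline-style output to True/False, or None when confidence is too low."""
    if not results:
        return None
    # Pick the highest-scoring entry rather than relying on list order
    top = max(results, key=lambda r: r["score"])
    if top["score"] < threshold:
        return None  # caller decides how to handle uncertain predictions
    # Assumption: the positive label is spelled "checked"
    return top["label"].lower() == "checked"

# Made-up scores, shaped like the pipeline's return value
sample = [
    {"label": "checked", "score": 0.97},
    {"label": "unchecked", "score": 0.03},
]
print(is_checked(sample))
```

In practice this would be called as `is_checked(classifier(image))`. Returning `None` below the threshold keeps low-confidence cases visible instead of silently forcing them into one class.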

Using AutoModel and AutoImageProcessor

from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image

# Load model and processor
processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
model = AutoModelForImageClassification.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")

# Prepare image (convert to RGB so grayscale images or PNGs with an alpha channel also work)
image = Image.open("checkbox.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Get prediction (gradients are not needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Get predicted class
predicted_class_idx = logits.argmax(-1).item()
predicted_label = model.config.id2label[predicted_class_idx]

# Get the confidence score of the predicted class
probabilities = torch.nn.functional.softmax(logits, dim=-1)
confidence = probabilities[0, predicted_class_idx].item()

print(f"Prediction: {predicted_label} (confidence: {confidence:.2%})")
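To see how the confidence value relates to the logits, here is the softmax step worked by hand in plain Python. The two logit values are made up for illustration; which index corresponds to which label is not assumed here and should be read from `model.config.id2label`.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, -1.0]  # hypothetical raw scores for the two classes
probs = softmax(logits)

# argmax picks the class; its probability is the reported confidence
predicted_idx = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[predicted_idx]
print(predicted_idx, f"{confidence:.2%}")
```

With only two classes, the confidence depends solely on the difference between the two logits, so a gap of 3 gives the same confidence regardless of their absolute values.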

Batch Processing

from transformers import pipeline
from PIL import Image

classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")

# Process multiple images
images = [Image.open(f"checkbox_{i}.jpg") for i in range(1, 4)]
results = classifier(images)

for i, result in enumerate(results):
    top_pred = result[0]  # Get top prediction
    print(f"Image {i+1}: {top_pred['label']} ({top_pred['score']:.2%})")
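The example above opens every image before classifying. For larger sets it may be preferable to work through the files in fixed-size chunks so that only a few images are held in memory at once. This is a minimal sketch of the chunking step only; the helper name `chunked`, the chunk size, and the file names are hypothetical.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical file names; replace with real paths
paths = [f"checkbox_{i}.jpg" for i in range(1, 11)]
batches = list(chunked(paths, 4))

for batch in batches:
    print(len(batch), batch[0], "...", batch[-1])
```

Each chunk can then be opened with `Image.open` and passed to `classifier(...)` as in the batch example, with the images released before the next chunk is loaded.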

Model Details

Architecture

Base Model: google/efficientnet-b0
Model Type: EfficientNet for Image Classification
Number of Labels: 2 (checked, unchecked)
Input Size: 224x224 RGB images
Framework: PyTorch via Transformers
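Since the model takes a fixed-size square input, it can help to cut a roughly square region around each checkbox out of a full screenshot before classification. The sketch below covers only the geometry and assumes a bounding box is already available from some other source; the model itself does not locate checkboxes. The function name `square_crop_box` and the 25% margin are hypothetical, and whether tight or loose crops match the training data is not stated in this card.

```python
def square_crop_box(bbox, image_size, margin=0.25):
    """Return (left, top, right, bottom) of a square crop centred on bbox.

    bbox is (left, top, right, bottom) in pixels; image_size is (width, height).
    The crop is enlarged by `margin` on each side and kept inside the image.
    """
    left, top, right, bottom = bbox
    width, height = image_size
    # Side length: longest bbox edge plus margin, capped by the image size
    side = max(right - left, bottom - top) * (1 + 2 * margin)
    side = int(round(min(side, width, height)))
    # Centre the square on the bbox, then shift it back inside the image
    x0 = int(round((left + right) / 2 - side / 2))
    y0 = int(round((top + bottom) / 2 - side / 2))
    x0 = min(max(x0, 0), width - side)
    y0 = min(max(y0, 0), height - side)
    return (x0, y0, x0 + side, y0 + side)

# Hypothetical 20x20 checkbox in a 1280x720 screenshot
box = square_crop_box(bbox=(100, 200, 120, 220), image_size=(1280, 720))
print(box)
```

With Pillow, `screenshot.crop(box)` would yield the region to pass to the classifier; the image processor then handles resizing to the model's input size.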

Training Details

Dataset: wendys-llc/chkbx
- ~4,800 training samples
- ~1,200 validation samples
Training Configuration:
- Epochs: 15 (with early stopping)
- Batch Size: 64 (on A100)
- Optimizer: AdamW with the default learning rate
- Mixed Precision: FP16
- Hardware: NVIDIA A100 GPU
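
For a sense of scale, the figures above imply the following split ratio and step counts. The sample counts are approximate in this card, so the results are estimates. The calculation also assumes a single GPU with no gradient accumulation, which the card does not state, and the step total is an upper bound because early stopping may end training sooner.

```python
import math

train_samples = 4800  # approximate, from the dataset summary
val_samples = 1200    # approximate
batch_size = 64
max_epochs = 15

# Share of the data held out for validation
val_fraction = val_samples / (train_samples + val_samples)

# Assumption: one optimizer step per batch (no gradient accumulation)
steps_per_epoch = math.ceil(train_samples / batch_size)
max_steps = steps_per_epoch * max_epochs

print(f"validation fraction: {val_fraction:.0%}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"max optimizer steps: {max_steps}")
```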

Acknowledgments

License

This model is licensed under the Apache 2.0 License. See the LICENSE file for details.
