---
license: apache-2.0
tags:
- image-classification
- computer-vision
- checkbox-detection
- efficientnet
datasets:
- wendys-llc/chkbx
metrics:
- accuracy
- f1
- precision
- recall
base_model: google/efficientnet-b0
model-index:
- name: checkbox-classifier-efficientnet
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      type: wendys-llc/chkbx
      name: Checkbox Detection Dataset
      split: validation
    metrics:
    - type: accuracy
      value: 0.97
      name: Validation Accuracy
library_name: transformers
pipeline_tag: image-classification
---
Checkbox State Classifier - EfficientNet-B0
A fine-tuned EfficientNet-B0 model for binary classification of checkbox states (checked/unchecked). It reaches ~95% accuracy on the validation set when classifying UI checkboxes.
Model Description
This model is fine-tuned from google/efficientnet-b0 on the wendys-llc/chkbx dataset. It's designed to classify UI checkboxes in screenshots and interface images.
Key Features
- No trust_remote_code required - Uses native transformers support
- Fast inference - EfficientNet-B0 is optimized for speed
- High accuracy - ~95% on validation set
- Simple API - Works with transformers pipeline out of the box
Usage
Quick Start with Pipeline (Recommended)
from transformers import pipeline
from PIL import Image
# Load the model
classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
# Classify an image
image = Image.open("checkbox.jpg")
results = classifier(image)
# Print results
for result in results:
    print(f"{result['label']}: {result['score']:.2%}")
# Get just the top prediction
top_result = classifier(image, top_k=1)[0]
print(f"Checkbox is: {top_result['label']} (confidence: {top_result['score']:.2%})")
Using AutoModel and AutoImageProcessor
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image
# Load model and processor
processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
model = AutoModelForImageClassification.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
# Prepare image (convert to RGB, since the model expects 3-channel input)
image = Image.open("checkbox.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
# Get predicted class
predicted_class_idx = logits.argmax(-1).item()
predicted_label = model.config.id2label[predicted_class_idx]
# Get confidence scores
probabilities = torch.nn.functional.softmax(logits, dim=-1)
confidence = probabilities.max().item()
print(f"Prediction: {predicted_label} (confidence: {confidence:.2%})")
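The confidence score above comes from applying softmax to the two class logits. As a rough illustration of that step, here is a minimal pure-Python sketch. The logit values and the label order are made up for the example; the real index-to-label mapping lives in `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first so exp() cannot overflow
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the two classes. The label order is an
# assumption; check model.config.id2label for the real mapping.
example_logits = [2.0, -1.0]
example_labels = ["checked", "unchecked"]

probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"Prediction: {example_labels[best]} (confidence: {probs[best]:.2%})")
```

Because there are only two classes, the two probabilities always sum to 1, so a confidence near 50% means the model is effectively undecided.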
Batch Processing
from transformers import pipeline
from PIL import Image
classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
# Process multiple images
images = [Image.open(f"checkbox_{i}.jpg") for i in range(1, 4)]
results = classifier(images)
for i, result in enumerate(results):
    top_pred = result[0]  # Get top prediction
    print(f"Image {i+1}: {top_pred['label']} ({top_pred['score']:.2%})")
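When processing many images, it can help to separate confident predictions from ones that deserve a manual look. The sketch below is one possible way to do that. The helper name `flag_uncertain` and the 0.9 threshold are illustrative assumptions, not part of this model or of transformers, and the mock data simply mimics the list-of-lists shape the pipeline returns for a list of images.

```python
def flag_uncertain(batch_results, threshold=0.9):
    """Split batch predictions into confident and needs-review lists.

    batch_results: one list of {"label", "score"} dicts per image,
    in the shape returned by the image-classification pipeline.
    threshold: hypothetical cut-off; tune it on your own data.
    """
    confident, review = [], []
    for index, predictions in enumerate(batch_results):
        # Pick the highest-scoring label for this image
        top = max(predictions, key=lambda p: p["score"])
        entry = (index, top["label"], top["score"])
        if top["score"] >= threshold:
            confident.append(entry)
        else:
            review.append(entry)
    return confident, review


# Mock results standing in for classifier(images); scores are made up
mock_results = [
    [{"label": "checked", "score": 0.98}, {"label": "unchecked", "score": 0.02}],
    [{"label": "unchecked", "score": 0.61}, {"label": "checked", "score": 0.39}],
    [{"label": "unchecked", "score": 0.95}, {"label": "checked", "score": 0.05}],
]

confident, review = flag_uncertain(mock_results)
print("Confident:", confident)
print("Needs review:", review)
```

In real use, `mock_results` would be replaced by the output of `classifier(images)`.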
Model Details
Architecture
- Base Model: google/efficientnet-b0
- Model Type: EfficientNet for Image Classification
- Number of Labels: 2 (checked, unchecked)
- Input Size: 224x224 RGB images
- Framework: PyTorch via Transformers
Training Details
- Dataset: wendys-llc/chkbx
  - ~4,800 training samples
  - ~1,200 validation samples
- Training Configuration:
  - Epochs: 15 (with early stopping)
  - Batch Size: 64 (on A100)
  - Learning Rate: Default AdamW
  - Mixed Precision: FP16
  - Hardware: NVIDIA A100 GPU
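The configuration above caps training at 15 epochs with early stopping, but does not state the stopping criterion. The following is a minimal sketch of how patience-based early stopping on validation loss generally works. The patience of 3, the choice of validation loss as the monitored metric, and the loss values are all assumptions for illustration, not the settings used to train this model.

```python
def simulate_early_stopping(val_losses, max_epochs=15, patience=3):
    """Return (epochs_run, best_epoch) for a sequence of validation losses.

    patience is a hypothetical setting: training stops once the loss
    has failed to improve for that many consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        epochs_run = epoch
        if loss < best_loss:
            # New best: remember it and reset the patience counter
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return epochs_run, best_epoch


# Made-up validation losses, one per epoch
simulated_losses = [0.60, 0.41, 0.33, 0.30, 0.31, 0.32, 0.30, 0.35, 0.36]

epochs_run, best_epoch = simulate_early_stopping(simulated_losses)
print(f"Stopped after epoch {epochs_run}; best checkpoint from epoch {best_epoch}")
```

With this kind of rule, the number of epochs actually run can be well below the 15-epoch cap, and the checkpoint kept is the one from the best epoch rather than the last.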
Acknowledgments
- Base model: google/efficientnet-b0
- Dataset: wendys-llc/chkbx
- Framework: HuggingFace Transformers
License
This model is licensed under the Apache 2.0 License. See the LICENSE file for details.