wendys-llc/checkbox-classifier

@@ -2,59 +2,147 @@
 license: apache-2.0
 tags:
 - image-classification
-- transformers
-- pytorch
 datasets:
 - wendys-llc/chkbx
 ---
-# Checkbox Classifier
-Binary classifier for checkbox states (checked/unchecked).
-## Usage with Transformers
 ```python
 from transformers import pipeline
-# Load pipeline
-classifier = pipeline("image-classification",
-                     model="wendys-llc/checkbox-classifier",
-                     trust_remote_code=True)
-# Predict
-from PIL import Image
 image = Image.open("checkbox.jpg")
-result = classifier(image)
-print(result)
-# [
-#   {'label': 'checked', 'score': 0.99},
-#   {'label': 'unchecked', 'score': 0.01}
-# ]
 ```
-## Direct Usage
 ```python
-from transformers import AutoModelForImageClassification, AutoImageProcessor
 import torch
 from PIL import Image
-model = AutoModelForImageClassification.from_pretrained(
-    "wendys-llc/checkbox-classifier",
-    trust_remote_code=True
-)
-processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier")
 image = Image.open("checkbox.jpg")
 inputs = processor(images=image, return_tensors="pt")
 with torch.no_grad():
     outputs = model(**inputs)
     logits = outputs.logits
-    predicted_class = logits.argmax(-1).item()
-print(model.config.id2label[predicted_class])
 ```
-## Accuracy: 97.1%

 license: apache-2.0
 tags:
 - image-classification
+- computer-vision
+- checkbox-detection
+- efficientnet
 datasets:
 - wendys-llc/chkbx
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+base_model: google/efficientnet-b0
+model-index:
+- name: checkbox-classifier-efficientnet
+  results:
+  - task:
+      type: image-classification
+      name: Image Classification
+    dataset:
+      type: wendys-llc/chkbx
+      name: Checkbox Detection Dataset
+      split: validation
+    metrics:
+    - type: accuracy
+      value: 0.97
+      name: Validation Accuracy
+library_name: transformers
+pipeline_tag: image-classification
 ---
+# Checkbox State Classifier - EfficientNet-B0
+A fine-tuned EfficientNet-B0 model for binary classification of checkbox states (checked/unchecked). It reaches roughly 95% accuracy on the validation split of the checkbox dataset.
+## Model Description
+This model is fine-tuned from [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0) on the [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx) dataset. It's designed to classify UI checkboxes in screenshots and interface images.
+### Key Features
+- **No `trust_remote_code` required** - Uses native transformers support
+- **Fast inference** - EfficientNet-B0 is the smallest model in the EfficientNet family
+- **High accuracy** - ~95% on validation set
+- **Simple API** - Works with transformers pipeline out of the box
+## Usage
+### Quick Start with Pipeline (Recommended)
 ```python
 from transformers import pipeline
+from PIL import Image
+# Load the model
+classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
+# Classify an image
 image = Image.open("checkbox.jpg")
+results = classifier(image)
+# Print results
+for result in results:
+    print(f"{result['label']}: {result['score']:.2%}")
+# Get just the top prediction
+top_result = classifier(image, top_k=1)[0]
+print(f"Checkbox is: {top_result['label']} (confidence: {top_result['score']:.2%})")
 ```
+### Using AutoModel and AutoImageProcessor
 ```python
+from transformers import AutoImageProcessor, AutoModelForImageClassification
 import torch
 from PIL import Image
+# Load model and processor
+processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
+model = AutoModelForImageClassification.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
+# Prepare image
 image = Image.open("checkbox.jpg")
 inputs = processor(images=image, return_tensors="pt")
+# Get prediction
 with torch.no_grad():
     outputs = model(**inputs)
     logits = outputs.logits
+    # Get predicted class
+    predicted_class_idx = logits.argmax(-1).item()
+    predicted_label = model.config.id2label[predicted_class_idx]
+    # Get confidence scores
+    probabilities = torch.nn.functional.softmax(logits, dim=-1)
+    confidence = probabilities.max().item()
+print(f"Prediction: {predicted_label} (confidence: {confidence:.2%})")
 ```
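The softmax step above turns the two raw logits into probabilities that sum to 1, and the reported confidence is simply the larger of the two. A minimal sketch in plain Python, using made-up logit values and an assumed label order (the real order comes from `model.config.id2label`):

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for illustration only, not real model output.
example_logits = [2.0, -1.0]
id2label = {0: "checked", 1: "unchecked"}  # assumed order

probs = softmax(example_logits)
# Index of the highest probability, equivalent to logits.argmax(-1).
predicted_idx = max(range(len(probs)), key=probs.__getitem__)
print(f"{id2label[predicted_idx]} (confidence: {probs[predicted_idx]:.2%})")
```

Because softmax preserves ordering, taking the argmax of the logits or of the probabilities gives the same predicted class.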
+### Batch Processing
+```python
+from transformers import pipeline
+from PIL import Image
+classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
+# Process multiple images
+images = [Image.open(f"checkbox_{i}.jpg") for i in range(1, 4)]
+results = classifier(images)
+for i, result in enumerate(results):
+    top_pred = result[0]  # Get top prediction
+    print(f"Image {i+1}: {top_pred['label']} ({top_pred['score']:.2%})")
+```
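When batch results feed an automated workflow, it can help to route low-confidence predictions to manual review instead of trusting the top label blindly. The sketch below works on data shaped like the pipeline output (one list of `{"label", "score"}` dicts per image). The `decide` helper, the `0.9` threshold, and the mocked scores are all hypothetical and not part of this model or of transformers.

```python
def decide(predictions, threshold=0.9):
    """Return the top label, or "uncertain" if its score is below threshold.

    `predictions` is the result for one image: a list of
    {"label": str, "score": float} dicts.
    """
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] if top["score"] >= threshold else "uncertain"

# Mocked results shaped like classifier(images) output (scores are made up).
mock_results = [
    [{"label": "checked", "score": 0.99}, {"label": "unchecked", "score": 0.01}],
    [{"label": "unchecked", "score": 0.62}, {"label": "checked", "score": 0.38}],
]

decisions = [decide(result) for result in mock_results]
for i, decision in enumerate(decisions, start=1):
    print(f"Image {i}: {decision}")
```

A suitable threshold depends on the cost of a wrong answer and should be chosen on held-out data rather than taken from this example.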
+## Model Details
+### Architecture
+- **Base Model**: google/efficientnet-b0
+- **Model Type**: EfficientNet for Image Classification
+- **Number of Labels**: 2 (checked, unchecked)
+- **Input Size**: 224x224 RGB images
+- **Framework**: PyTorch via Transformers
+### Training Details
+- **Dataset**: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
+  - ~4,800 training samples
+  - ~1,200 validation samples
+- **Training Configuration**:
+  - Epochs: 15 (with early stopping)
+  - Batch Size: 64 (on A100)
+  - Optimizer: AdamW with the default learning rate
+  - Mixed Precision: FP16
+  - Hardware: NVIDIA A100 GPU
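As a rough sanity check on the scale of this run, the figures above imply the following number of optimizer steps. This is only a sketch: it uses the approximate sample count and assumes a single GPU with no gradient accumulation, neither of which the card states explicitly.

```python
import math

# Approximate figures taken from the training details above.
train_samples = 4800  # "~4,800 training samples"
batch_size = 64
max_epochs = 15       # early stopping may end training sooner

# Assumption: one GPU and no gradient accumulation, so one batch = one step.
steps_per_epoch = math.ceil(train_samples / batch_size)
max_steps = steps_per_epoch * max_epochs
print(f"{steps_per_epoch} steps per epoch, at most {max_steps} steps in total")
```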
+## Acknowledgments
+- Base model: [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0)
+- Dataset: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
+- Framework: [HuggingFace Transformers](https://github.com/huggingface/transformers)
+## License
+This model is licensed under the Apache 2.0 License. See the [Apache License 2.0 text](https://www.apache.org/licenses/LICENSE-2.0) for details.