wendys-llc committed on
Commit
4aed93e
·
verified ·
1 Parent(s): b91569c

Update README.md

  1. README.md +115 -27
README.md CHANGED
@@ -2,59 +2,147 @@
2
  license: apache-2.0
3
  tags:
4
  - image-classification
5
- - transformers
6
- - pytorch
 
7
  datasets:
8
  - wendys-llc/chkbx
9
  ---
10
 
11
- # Checkbox Classifier
12
 
13
- Binary classifier for checkbox states (checked/unchecked).
14
 
15
- ## Usage with Transformers
16
 
17
  ```python
18
  from transformers import pipeline
 
19
 
20
- # Load pipeline
21
- classifier = pipeline("image-classification",
22
-                       model="wendys-llc/checkbox-classifier",
23
-                       trust_remote_code=True)
24
 
25
- # Predict
26
- from PIL import Image
27
  image = Image.open("checkbox.jpg")
28
- result = classifier(image)
29
- print(result)
30
- # [
31
- # {'label': 'checked', 'score': 0.99},
32
- # {'label': 'unchecked', 'score': 0.01}
33
- # ]
 
 
 
34
  ```
35
 
36
- ## Direct Usage
37
 
38
  ```python
39
- from transformers import AutoModelForImageClassification, AutoImageProcessor
40
  import torch
41
  from PIL import Image
42
 
43
- model = AutoModelForImageClassification.from_pretrained(
44
-     "wendys-llc/checkbox-classifier",
45
-     trust_remote_code=True
46
- )
47
- processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier")
48
 
 
49
  image = Image.open("checkbox.jpg")
50
  inputs = processor(images=image, return_tensors="pt")
51
 
 
52
  with torch.no_grad():
53
      outputs = model(**inputs)
54
      logits = outputs.logits
55
- predicted_class = logits.argmax(-1).item()
56
 
57
- print(model.config.id2label[predicted_class])
58
  ```
59
 
60
- ## Accuracy: 97.1%
2
  license: apache-2.0
3
  tags:
4
  - image-classification
5
+ - computer-vision
6
+ - checkbox-detection
7
+ - efficientnet
8
  datasets:
9
  - wendys-llc/chkbx
10
+ metrics:
11
+ - accuracy
12
+ - f1
13
+ - precision
14
+ - recall
15
+ base_model: google/efficientnet-b0
16
+ model-index:
17
+ - name: checkbox-classifier-efficientnet
18
+   results:
19
+   - task:
20
+       type: image-classification
21
+       name: Image Classification
22
+     dataset:
23
+       type: wendys-llc/chkbx
24
+       name: Checkbox Detection Dataset
25
+       split: validation
26
+     metrics:
27
+     - type: accuracy
28
+       value: 0.97
29
+       name: Validation Accuracy
30
+ library_name: transformers
31
+ pipeline_tag: image-classification
32
  ---
33
 
34
+ # Checkbox State Classifier - EfficientNet-B0
35
 
36
+ A fine-tuned EfficientNet-B0 model for binary classification of checkbox states (checked/unchecked). It reaches roughly 95% accuracy on the validation set when classifying UI checkboxes.
37
 
38
+ ## Model Description
39
+
40
+ This model is fine-tuned from [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0) on the [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx) dataset. It's designed to classify UI checkboxes in screenshots and interface images.
41
+
42
+ ### Key Features
43
+ - **No `trust_remote_code` required** - Uses native transformers support
44
+ - **Fast inference** - EfficientNet-B0 is optimized for speed
45
+ - **High accuracy** - ~95% on validation set
46
+ - **Simple API** - Works with transformers pipeline out of the box
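The card metadata lists accuracy, F1, precision, and recall as metrics. As a minimal sketch of how those four numbers relate for a two-class problem, the helper below computes them from true and predicted labels. The function name `binary_metrics` and the sample labels are hypothetical and are not taken from the model's evaluation code or data.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "checked"
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = ["checked", "checked", "checked", "unchecked",
          "unchecked", "unchecked", "unchecked", "checked"]
y_pred = ["checked", "checked", "unchecked", "unchecked",
          "unchecked", "checked", "unchecked", "checked"]
metrics = binary_metrics(y_true, y_pred)
```

Treating `checked` as the positive class is an assumption; swapping `positive` to `unchecked` gives the metrics for the other class.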
47
+
48
+ ## Usage
49
+
50
+ ### Quick Start with Pipeline (Recommended)
51
 
52
  ```python
53
  from transformers import pipeline
54
+ from PIL import Image
55
 
56
+ # Load the model
57
+ classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
 
 
58
 
59
+ # Classify an image
 
60
  image = Image.open("checkbox.jpg")
61
+ results = classifier(image)
62
+
63
+ # Print results
64
+ for result in results:
65
+     print(f"{result['label']}: {result['score']:.2%}")
66
+
67
+ # Get just the top prediction
68
+ top_result = classifier(image, top_k=1)[0]
69
+ print(f"Checkbox is: {top_result['label']} (confidence: {top_result['score']:.2%})")
70
  ```
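The pipeline returns a list of `{'label', 'score'}` dictionaries. When a low-confidence prediction should be treated as "unknown" rather than trusted, a small wrapper can apply a cutoff. The sketch below is one possible way to do that; the helper name `top_label` and the 0.9 threshold are assumptions, not part of the model or of Transformers.

```python
from typing import Dict, List, Optional, Union

Prediction = Dict[str, Union[str, float]]


def top_label(results: List[Prediction], min_score: float = 0.9) -> Optional[str]:
    """Return the highest-scoring label, or None if it is below min_score."""
    if not results:
        return None
    best = max(results, key=lambda r: r["score"])
    return best["label"] if best["score"] >= min_score else None


# Input shaped like the pipeline output; the scores are made up.
example = [
    {"label": "checked", "score": 0.99},
    {"label": "unchecked", "score": 0.01},
]
print(top_label(example))  # prints: checked
```

Returning `None` instead of a label leaves the decision about uncertain images to the caller, for example by routing them to manual review.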
71
 
72
+ ### Using AutoModel and AutoImageProcessor
73
 
74
  ```python
75
+ from transformers import AutoImageProcessor, AutoModelForImageClassification
76
  import torch
77
  from PIL import Image
78
 
79
+ # Load model and processor
80
+ processor = AutoImageProcessor.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
81
+ model = AutoModelForImageClassification.from_pretrained("wendys-llc/checkbox-classifier-efficientnet")
 
 
82
 
83
+ # Prepare image
84
  image = Image.open("checkbox.jpg")
85
  inputs = processor(images=image, return_tensors="pt")
86
 
87
+ # Get prediction
88
  with torch.no_grad():
89
      outputs = model(**inputs)
90
      logits = outputs.logits
 
91
 
92
+ # Get predicted class
93
+ predicted_class_idx = logits.argmax(-1).item()
94
+ predicted_label = model.config.id2label[predicted_class_idx]
95
+
96
+ # Get confidence scores
97
+ probabilities = torch.nn.functional.softmax(logits, dim=-1)
98
+ confidence = probabilities.max().item()
99
+
100
+ print(f"Prediction: {predicted_label} (confidence: {confidence:.2%})")
101
  ```
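The confidence value in the example above comes from applying softmax to the two logits and taking the largest probability. The dependency-free sketch below reproduces that arithmetic with plain Python so the step is easy to inspect; the logit values are made up, and the real model output will differ.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a two-class output.
logits = [2.0, -1.0]
probs = softmax(logits)

# The argmax index and its probability mirror predicted_class_idx and confidence.
predicted_idx = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[predicted_idx]
print(f"class index {predicted_idx}, confidence {confidence:.2%}")
```

Which index maps to `checked` or `unchecked` depends on `model.config.id2label`, so the index alone should not be interpreted without that mapping.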
102
 
103
+ ### Batch Processing
104
+
105
+ ```python
106
+ from transformers import pipeline
107
+ from PIL import Image
108
+
109
+ classifier = pipeline("image-classification", model="wendys-llc/checkbox-classifier-efficientnet")
110
+
111
+ # Process multiple images
112
+ images = [Image.open(f"checkbox_{i}.jpg") for i in range(1, 4)]
113
+ results = classifier(images)
114
+
115
+ for i, result in enumerate(results):
116
+     top_pred = result[0]  # Get top prediction
117
+     print(f"Image {i+1}: {top_pred['label']} ({top_pred['score']:.2%})")
118
+ ```
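For a batch, the pipeline returns one list of predictions per image. When only the totals matter, for instance how many boxes on a form are ticked, the per-image results can be tallied. The sketch below assumes that nested-list shape; the helper name `count_labels` and the sample scores are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def count_labels(batch_results: List[List[Prediction]]) -> Counter:
    """Count the top-scoring label of each image in a batch."""
    return Counter(
        max(per_image, key=lambda r: r["score"])["label"]
        for per_image in batch_results
        if per_image  # skip images with no predictions
    )


# Made-up results for three images.
batch_results = [
    [{"label": "checked", "score": 0.98}, {"label": "unchecked", "score": 0.02}],
    [{"label": "unchecked", "score": 0.91}, {"label": "checked", "score": 0.09}],
    [{"label": "checked", "score": 0.75}, {"label": "unchecked", "score": 0.25}],
]
counts = count_labels(batch_results)
```

Taking the maximum by score rather than relying on position keeps the helper correct even if the predictions are not sorted.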
119
+
120
+ ## Model Details
121
+
122
+ ### Architecture
123
+ - **Base Model**: google/efficientnet-b0
124
+ - **Model Type**: EfficientNet for Image Classification
125
+ - **Number of Labels**: 2 (checked, unchecked)
126
+ - **Input Size**: 224x224 RGB images
127
+ - **Framework**: PyTorch via Transformers
128
+
129
+ ### Training Details
130
+ - **Dataset**: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
131
+   - ~4,800 training samples
132
+   - ~1,200 validation samples
133
+ - **Training Configuration**:
134
+   - Epochs: 15 (with early stopping)
135
+   - Batch Size: 64 (on A100)
136
+   - Optimizer: AdamW with the default learning rate
137
+   - Mixed Precision: FP16
138
+   - Hardware: NVIDIA A100 GPU
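The training configuration mentions a 15-epoch cap with early stopping but does not state the patience or the monitored metric. As an illustration of the general technique only, the sketch below stops once validation loss has failed to improve for `patience` consecutive epochs. Monitoring validation loss, the patience of 3, the helper name, and the loss values are all assumptions, not details of the actual training run.

```python
from typing import Iterable, Tuple


def run_with_early_stopping(
    val_losses: Iterable[float], max_epochs: int = 15, patience: int = 3
) -> Tuple[int, int]:
    """Return (best_epoch, last_epoch_run) for a sequence of validation losses."""
    best_epoch, best_loss, stale, last_epoch = 0, float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if epoch > max_epochs:
            break
        last_epoch = epoch
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter.
            best_epoch, best_loss, stale = epoch, loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_epoch, last_epoch


# Made-up loss curve: improves until epoch 3, then plateaus.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
best_epoch, last_epoch = run_with_early_stopping(losses)
```

With this curve the loop stops before reaching the final value, which shows the trade-off: a small patience saves compute but can miss a late improvement.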
139
+
140
+ ## Acknowledgments
141
+
142
+ - Base model: [google/efficientnet-b0](https://huggingface.co/google/efficientnet-b0)
143
+ - Dataset: [wendys-llc/chkbx](https://huggingface.co/datasets/wendys-llc/chkbx)
144
+ - Framework: [HuggingFace Transformers](https://github.com/huggingface/transformers)
145
+
146
+ ## License
147
+
148
+ This model is licensed under the Apache 2.0 License. See the [Apache License 2.0 text](https://www.apache.org/licenses/LICENSE-2.0) for details.