---
library_name: torch
tags:
- image-classification
- resnet
- diagrams
- pytorch
- computer-vision
license: apache-2.0
metrics:
- accuracy
- f1
- recall
- precision
base_model:
- microsoft/resnet-18
pipeline_tag: image-classification
datasets:
- phiyodr/coco2017
- HuggingFaceM4/ChartQA
- JasmineQiuqiu/diagrams_with_captions_2
---

# Model Card for Diagram Classification Model

## Model Details

### Model Description

This is a fine-tuned ResNet-18 model trained for binary image classification, distinguishing between **diagrams** and **non-diagrams**. The model is designed for use in applications that need automatic filtering or processing of diagram-based content.

- **Developed by:** Aya Mohamed
- **Model type:** ResNet-18 (Fine-tuned for image classification)
- **Language(s) (NLP):** Not applicable (Computer Vision model)
- **License:** Apache 2.0
- **Finetuned from model:** `microsoft/resnet-18`

### Model Sources

- **Repository:** [Ayamohamed/diaclass-model](https://huggingface.co/Ayamohamed/diaclass-model)

## Uses

### Direct Use

This model is intended for classifying images as **diagrams** or **non-diagrams**. It can be used in:
- **Document processing** (extracting diagrams from PDFs or scanned documents)
- **Chart-based visual question generation (VQG)**
- **Content moderation** (filtering diagram images from general image datasets)

### Out-of-Scope Use

- Not suitable for **multi-class classification** beyond diagrams vs. non-diagrams.
- Not designed for **hand-drawn sketches** or **complex figures with mixed elements**.

## Bias, Risks, and Limitations

- The model's accuracy depends on the training dataset, which may not cover all possible diagram styles.
- May misclassify **charts, blueprints, or artistic drawings** if they resemble diagrams.

### Recommendations

Users should **evaluate the model** on their specific dataset before deployment to ensure it performs well in their context.
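One way to run such a check is to collect the model's predictions on a labeled sample of your own images and compute the same metrics this card reports. The sketch below is a minimal, dependency-free illustration. The helper name `binary_metrics` and the sample labels are hypothetical, and treating label `0` (diagram) as the positive class is an assumption made for this example.

```python
def binary_metrics(y_true, y_pred, positive=0):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as the positive class. Using 0 (diagram)
    is an assumption made for this example.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class never appears
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for six images (0 = diagram, 1 = non-diagram)
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A large gap between these numbers and the ones reported in this card would suggest that your images differ from the training distribution.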



## 🚀 How to Use

### **1️⃣ Load the Model from Hugging Face**
You can download the model and load it using `torch`.

```python
import torch
from huggingface_hub import hf_hub_download

# Download model from Hugging Face Hub
model_path = hf_hub_download(repo_id="Ayamohamed/DiaClassification", filename="model.pth")

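# Load the model. The checkpoint is a fully pickled model object rather than a
# state_dict, so weights_only=False is needed on PyTorch >= 2.6, where the
# default changed to True. Only unpickle files from sources you trust.
model_hg = torch.load(model_path, map_location="cpu", weights_only=False)
model_hg.eval()  # Set to evaluation mode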
```

### **2️⃣ Preprocess and Classify an Image**

```python
from PIL import Image
from torchvision import transforms

# Resize to the ResNet input size and normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])


def predict(image_path):
    """Classify one image file as "Diagram" or "Not Diagram"."""
    image = Image.open(image_path).convert("RGB")
    image = transform(image).unsqueeze(0)  # Add a batch dimension
    with torch.no_grad():
        output = model_hg(image)
        class_idx = torch.argmax(output, dim=1).item()

    # Class index 0 is the diagram class
    return "Diagram" if class_idx == 0 else "Not Diagram"


# Example usage
print(predict("my-diagram-classifier/31188_1536932698.jpg"))
```
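
When a confidence score is useful alongside the label, the model output can be passed through a softmax. The sketch below assumes the model returns raw logits of shape `(1, 2)`, which is consistent with the cross-entropy training and the `argmax` over `dim=1` used above but is not stated explicitly in this card. The helper name `logits_to_label` is hypothetical, and the label order follows the `predict` function (index 0 is the diagram class).

```python
import torch


def logits_to_label(logits, labels=("Diagram", "Not Diagram")):
    """Convert a (1, 2) logits tensor into a (label, confidence) pair."""
    probs = torch.softmax(logits, dim=1)      # normalize logits to probabilities
    confidence, class_idx = probs.max(dim=1)  # top probability and its index
    return labels[class_idx.item()], confidence.item()


# Dummy logits standing in for the model output on one image
label, confidence = logits_to_label(torch.tensor([[2.0, 0.0]]))
print(f"{label} ({confidence:.2%})")
```

With the loaded model, the call would look like `logits_to_label(model_hg(image))` inside a `torch.no_grad()` block.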



## Training Details

### Training Data

The model was trained using:
- **ChartQA dataset** (for diagram samples)
- **JasmineQiuqiu/diagrams_with_captions_2** (for diagram samples)
- **COCO dataset (subset)** (for non-diagram samples)

### Training Procedure

- **Pretrained model:** `microsoft/resnet-18`
- **Optimization:** Adam optimizer
- **Loss function:** Cross-entropy loss
- **Training duration:** Approx. X hours on an NVIDIA GPU

## Evaluation

### Testing Data & Metrics

- **Dataset:** Held-out test set from ChartQA, AI2D-RST, and COCO
- **Metrics:**
  - **Test Loss:** 0.0371
  - **Test Accuracy:** 99.08%
  - **Precision:** 0.9995
  - **Recall:** 0.9820
  - **F1 Score:** 0.9907
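
As a quick consistency check, the F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet uses only the values listed above.

```python
precision = 0.9995
recall = 0.9820

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9907
```

The result matches the reported F1 score to four decimal places.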

## Environmental Impact

- **Hardware Used:** NVIDIA A100 GPU
- **Compute Hours:** Approx. X hours
- **Estimated Carbon Emission:** Not reported; it can be estimated with the [MLCO2 Calculator](https://mlco2.github.io/impact#compute)

## Citation

If you use this model, please cite:

```bibtex
@misc{aya2025diaclass,
  author = {Aya Mohamed},
  title = {Diagram Classification Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Ayamohamed/diaclass-model}
}
```