stefanodangelo committed on
Commit
c952138
·
verified ·
1 Parent(s): 1b5ec80

Update README.md

  1. README.md +163 -1
README.md CHANGED
@@ -5,7 +5,169 @@ language:
  metrics:
  - accuracy
  - f1
+ - precision
+ - recall
  base_model:
  - microsoft/swin-large-patch4-window7-224
  pipeline_tag: image-classification
- ---
+ ---
+ # Model Card for ChartDet
+
+ ## Model Details
+
+ ### Model Description
+
+ **ChartDet** is an implementation of the Swin Transformer block used by the ChartEye model, adapted for chart detection. While the ChartEye paper focuses on identifying specific chart types, this model is trained as a binary classifier that distinguishes chart images from non-chart images.
+
+ - **Developed by:** Stefano D’Angelo
+ - **Model type:** Image Classification (Swin Transformer)
+ - **Language(s) (NLP):** Not applicable
+ - **License:** MIT
+ - **Finetuned from model:** microsoft/swin-large-patch4-window7-224
+
+ ### Model Sources
+
+ - **Repository:** [ChartDet GitHub Repository](https://github.com/stefanodangelo/ChartDet)
+ - **Paper:** [ChartEye Paper](https://arxiv.org/abs/2408.16123)
+
+ ## Uses
+
+ ### Direct Use
+
+ This model can be used directly to classify images as chart or non-chart.
+
+ ### Downstream Use
+
+ The model can be fine-tuned further for specific chart type classification tasks or integrated into applications for automated document analysis.
+
+ ### Out-of-Scope Use
+
+ The model is not designed for identifying specific chart types (e.g., bar, line, pie) or for tasks outside chart detection.
+
+ ### Bias, Risks, and Limitations
+
+ The model was trained on the ICPR2022 CHARTINFO, PACS, and DomainNet datasets, which may not fully represent all chart types or images found in real-world scenarios. Potential biases include:
+
+ - **Dataset Bias:** The training datasets may underrepresent certain chart styles or image types, impacting model generalization.
+ - **Domain Limitations:** Performance may degrade on charts or images from unseen domains or with significant visual noise.
+ - **Misclassification Risk:** Non-chart images with chart-like features (e.g., diagrams) may occasionally be misclassified.
+
+ Users should carefully evaluate the model on their specific data to ensure compatibility and adjust as needed.
+
+ ### Recommendations
+
+ - Users should ensure that input data matches the training domain to achieve optimal performance.
+ - Avoid using the model for unrelated image classification tasks without fine-tuning.
+
+ ## How to Get Started with the Model
+
+ Use the following code snippet to get started:
+
+ ```python
+ # Coming soon
+ ```
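Until the author's snippet lands, the following is a minimal sketch of how a model tagged `image-classification` is typically loaded with the Transformers `pipeline` API. It is not the author's code. The model ID is taken from the citation URL in this card; that the repository contains the config and preprocessor files the pipeline needs is an assumption, and the helper names `top_prediction` and `classify_image` are hypothetical.

```python
from typing import Dict, List

# Model ID taken from the citation URL in this card. That the repo ships the
# config and preprocessor files required by the pipeline is an assumption.
MODEL_ID = "stefanodangelo/chartdet"


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from image-classification output."""
    return max(predictions, key=lambda p: p["score"])


def classify_image(image_path: str) -> Dict:
    """Classify one image file; downloads the model on first use."""
    # Imported lazily so top_prediction works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    # The pipeline returns a list of {"label": str, "score": float} dicts.
    return top_prediction(classifier(image_path))
```

Calling `classify_image("figure.png")` would return a single `{"label": ..., "score": ...}` dict. The label names come from the model's config and are not documented in this card, so inspect them before mapping them to chart / non-chart.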
+
+ ## Training Details
+
+ ### Training Data
+
+ The model was trained on a combination of:
+
+ - [ICPR2022 CHARTINFO UB PMC competition dataset](https://www.kaggle.com/datasets/pranithchowdary/icpr-2022?resource=download-directory)
+ - [PACS dataset](https://paperswithcode.com/dataset/pacs)
+ - [DomainNet dataset](https://paperswithcode.com/dataset/domainnet)
+
+ ### Training Procedure
+
+ The model was fine-tuned using the following setup:
+
+ #### Preprocessing
+
+ Images were preprocessed to match the input requirements of the Swin Transformer model (e.g., resizing, normalization).
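The card does not give the exact preprocessing values. As a rough illustration of the normalization step, the sketch below assumes the ImageNet channel statistics commonly used with Swin checkpoints and a 224×224 input size inferred from the base model name; both are assumptions, and `normalize_pixel` is a hypothetical helper.

```python
from typing import Sequence, Tuple

# Assumed values: ImageNet statistics commonly used with Swin checkpoints.
# The card itself does not state the numbers used during training.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)
INPUT_SIZE = 224  # inferred from the base model name (patch4-window7-224)


def normalize_pixel(rgb: Sequence[int]) -> Tuple[float, ...]:
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )
```

In practice the image processor bundled with the checkpoint (loaded via `AutoImageProcessor.from_pretrained`) applies resizing and normalization together, so a hand-written version like this is mainly useful for checking what the model expects.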
+
+ #### Training Hyperparameters
+
+ - **Optimizer:** Adam
+ - **Loss Function:** CrossEntropyLoss
+ - **Batch Size:** 8
+ - **Epochs:** 12
+ - **Learning Rate:** 3e-6
+ - **Seed:** 42
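The hyperparameters above could be wired into PyTorch roughly as follows. This is a sketch, not the author's training script: the `TrainConfig` class and both function names are hypothetical, and only the numeric values, the optimizer, and the loss come from the card.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameter values as listed in the card."""
    batch_size: int = 8
    epochs: int = 12
    learning_rate: float = 3e-6
    seed: int = 42


def steps_per_epoch(num_samples: int, cfg: TrainConfig) -> int:
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / cfg.batch_size)


def build_optimizer_and_loss(model, cfg: TrainConfig):
    """Create the Adam optimizer and cross-entropy loss for a PyTorch model."""
    import torch  # lazy import: only needed when actually training

    # Seeds subsequent random operations (e.g. shuffling, dropout).
    torch.manual_seed(cfg.seed)
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg.learning_rate)
    loss_fn = torch.nn.CrossEntropyLoss()
    return optimizer, loss_fn
```

With a batch size of 8, a hypothetical training set of 1,000 images would take 125 optimizer steps per epoch; the card does not report the actual dataset size.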
+
+ ## Evaluation
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ Evaluation used subsets of the datasets listed above: metrics were computed on held-out validation or test splits produced by an 80-20 split.
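One way to produce a reproducible 80-20 split is sketched below. Reusing the training seed (42) for the split and shuffling without stratification are assumptions; the card does not say how the split was drawn, and `split_80_20` is a hypothetical helper.

```python
import random
from typing import List, Sequence, Tuple


def split_80_20(items: Sequence, seed: int = 42) -> Tuple[List, List]:
    """Shuffle deterministically, then keep 80% for training and 20% held out."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]
```

Because the shuffle uses its own seeded `random.Random` instance, repeated calls with the same inputs return the same partition regardless of other random state in the program.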
+
+ #### Factors
+
+ Performance was assessed across images from diverse domains (e.g., charts vs. natural images).
+
+ #### Metrics
+
+ Evaluation metrics included:
+
+ - Accuracy
+ - Confusion Matrix
+ - Classification Report (e.g., Precision, Recall, F1-Score)
+
+ ### Results
+
+ The model separates chart images from non-chart images with high accuracy on the evaluation split. Quantitative results are as follows:
+ - Accuracy: 99.89%
+ - Precision (Weighted): 99.80%
+ - Recall (Weighted): 99.93%
+ - F1-Score (Weighted): 99.87%
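For readers who want to recompute such figures from a confusion matrix, the sketch below shows the standard support-weighted definitions of precision, recall, and F1. It illustrates the general technique only; the card does not publish its confusion matrix, and `weighted_metrics` is a hypothetical helper.

```python
from typing import Dict, List


def weighted_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and support-weighted precision, recall, and F1.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n_classes)) / total

    precision = recall = f1 = 0.0
    for i in range(n_classes):
        tp = cm[i][i]
        support = sum(cm[i])                  # samples whose true class is i
        predicted = sum(row[i] for row in cm)  # samples predicted as class i
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Under support weighting, weighted recall is algebraically identical to accuracy, since each class's recall is multiplied by its share of the samples. Weighted recall and accuracy can therefore only differ when another averaging scheme or a different evaluation set is involved.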
+
+ #### Summary
+
+ The model achieves reliable classification for the intended task within the training domain.
+
+ ## Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** NVIDIA GeForce RTX 4070 Ti
+ - **Hours used:** ~2.47 hours (total training time: 12 epochs × ~740 seconds per epoch = ~8880 seconds)
+ - **Cloud Provider:** Local (no cloud provider used)
+ - **Compute Region:** Not applicable (local machine)
+ - **Carbon Emitted:** Not computed (local compute environment without data on power source or emissions factor)
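The training-time arithmetic above, and the kind of estimate the calculator performs, can be sketched as follows. The GPU power and grid carbon intensity are assumptions, not values from the card: 285 W is the card's rated board power, and 0.4 kg CO2 per kWh is a placeholder to be replaced with the local figure.

```python
EPOCHS = 12
SECONDS_PER_EPOCH = 740  # approximate, from the card

# Assumptions, not from the card: rated GPU board power and a placeholder
# grid carbon intensity. Replace both with measured or local values.
ASSUMED_GPU_POWER_W = 285
ASSUMED_KG_CO2_PER_KWH = 0.4


def training_hours(epochs: int, seconds_per_epoch: float) -> float:
    """Total training time in hours."""
    return epochs * seconds_per_epoch / 3600


def estimate_emissions_kg(hours: float, power_w: float,
                          kg_co2_per_kwh: float) -> float:
    """Rough estimate: energy drawn at constant power times grid intensity."""
    energy_kwh = power_w / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


hours = training_hours(EPOCHS, SECONDS_PER_EPOCH)  # 8880 s / 3600 ≈ 2.47 h
estimate = estimate_emissions_kg(hours, ASSUMED_GPU_POWER_W,
                                 ASSUMED_KG_CO2_PER_KWH)
```

The estimate treats the GPU as drawing its rated power for the whole run and ignores the rest of the machine, so it is an order-of-magnitude figure rather than a measurement.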
+
+ ## Technical Specifications
+
+ ### Model Architecture and Objective
+
+ The model uses a Swin Transformer-based architecture adapted for binary image classification.
+
+ ### Compute Infrastructure
+
+ #### Hardware
+
+ - NVIDIA GeForce RTX 4070 Ti
+
+ #### Software
+
+ - Windows 11
+ - Python 3.11
+ - Hugging Face Transformers library
+ - PyTorch
+
+ ## Citation
+
+ **BibTeX:**
+
+ ```bibtex
+ @misc{chartdet2025,
+ author = {Stefano D’Angelo},
+ title = {ChartDet: A Swin Transformer Model for Chart Classification},
+ year = {2025},
+ howpublished = {\url{https://huggingface.co/stefanodangelo/chartdet}}
+ }
+ ```
+
+ ## Model Card Authors
+
+ Stefano D’Angelo