---
base_model: aswincandra/rgai-air-pollution-image-classification
metrics:
  - accuracy
model-index:
  - name: rgai-air-pollution-image-classification
    results:
      - task:
          name: Image Classification
          type: image-classification
        dataset:
          name: imagefolder
          type: imagefolder
          config: default
          split: train
          args: default
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8166
---

RGAI Air Pollution Image Classification by Aswin Candra

Example Usage

First of all, clone this repo.

```python
from aswin_air_pollution import CustomCNN

model = CustomCNN.from_pretrained('aswincandra/rgai-air-pollution-image-classification')
```

Notebook example: here
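
The preprocessing pipeline is not documented here, so below is a minimal inference sketch. It assumes the model is a PyTorch module whose forward pass returns class logits for 224 x 224 RGB inputs (per the architecture description below) and uses torchvision only for resizing and tensor conversion; any normalization should be taken from the training notebook. `example.jpg` is a placeholder path.

```python
import torch
from PIL import Image
from torchvision import transforms

from aswin_air_pollution import CustomCNN

model = CustomCNN.from_pretrained('aswincandra/rgai-air-pollution-image-classification')
model.eval()

# Assumed preprocessing: resize to the 224 x 224 input size and convert to a tensor.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

image = Image.open('example.jpg').convert('RGB')  # placeholder image path
inputs = preprocess(image).unsqueeze(0)           # add a batch dimension

with torch.no_grad():
    logits = model(inputs)

predicted_id = logits.argmax(dim=-1).item()
print(predicted_id)
```

The predicted index can be mapped to a label name with the dictionary listed under "Output labels dictionary" below.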

Model description

This model attempts to reproduce the architecture of Utomo Sapdo et al. on the Air Pollution Image Dataset from India and Nepal (a Kaggle dataset). It achieves the following results on the test set:

  • Loss: 0.5276
  • Accuracy: 0.8166

Architecture

Quoted from Utomo Sapdo et al.:

> The proposed model accepts (224 x 224 x 3) RGB images as inputs. The initial model block is comprised of two CNN layers with 64 filters and one maxpooling layer. The second model block contains two CNN layers with 128 filters and one maxpooling layer. The third through fifth blocks use modified residual blocks that only apply one CNN layer, with the output of that CNN layer being added to the previous maxpooling output before being transmitted to the maxpooling layer. All maxpooling layers employ a kernel size of 3x3. Instead of ReLU, the activation function for all CNN layers is LeakyReLU. These five blocks are utilized for image feature extraction. The extracted features will then be flattened and transmitted to FC layer sets. The first FC layer consists of 256 neurons, whereas the second FC layer consists of 128 neurons. Additionally, these two FC layers use LeakyReLU as an activation function.

The output of the last FC layer described above is then fed to a final fully-connected layer with 6 outputs, corresponding to the number of class labels.
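
The filter count for blocks three through five and the pooling stride are not specified above. The PyTorch sketch below is an illustration of that description, not the reference implementation: it assumes 128 filters in the residual blocks (so the additions are shape-compatible), 3x3 convolutions with "same" padding, and a pooling stride of 2.

```python
import torch
import torch.nn as nn


class CustomCNNSketch(nn.Module):
    """Illustrative reconstruction of the described architecture (not the repo's CustomCNN)."""

    def __init__(self, num_classes: int = 6):
        super().__init__()
        # Block 1: two conv layers with 64 filters, followed by 3x3 max pooling.
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.LeakyReLU(),
        )
        # Block 2: two conv layers with 128 filters, followed by 3x3 max pooling.
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(128, 128, kernel_size=3, padding=1), nn.LeakyReLU(),
        )
        # Blocks 3-5: a single conv layer whose output is added to the block input
        # (the previous max-pooling output) before the next max pooling.
        # 128 filters are assumed so that the addition is shape-compatible.
        self.res_convs = nn.ModuleList([
            nn.Sequential(nn.Conv2d(128, 128, kernel_size=3, padding=1), nn.LeakyReLU())
            for _ in range(3)
        ])
        # All max-pooling layers use a 3x3 kernel; stride 2 is an assumption.
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2)
        # Classifier head: FC 256 -> FC 128 (both LeakyReLU) -> final FC with 6 outputs.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 6 * 6, 256), nn.LeakyReLU(),
            nn.Linear(256, 128), nn.LeakyReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(self.block1(x))   # 224 -> 111
        x = self.pool(self.block2(x))   # 111 -> 55
        for conv in self.res_convs:     # 55 -> 27 -> 13 -> 6
            x = self.pool(conv(x) + x)
        return self.classifier(x)


# Shape check with a dummy batch of two 224 x 224 RGB images.
print(CustomCNNSketch()(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 6])
```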

Output labels dictionary

  • '0': 'a_Good'
  • '1': 'b_Moderate'
  • '2': 'c_Unhealthy_for_Sensitive_Groups'
  • '3': 'd_Unhealthy'
  • '4': 'e_Very_Unhealthy'
  • '5': 'f_Severe'
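
Expressed as a plain Python dictionary, using `predicted_id` from the usage sketch above (the published model may already carry an equivalent `id2label` mapping):

```python
# Label mapping as listed above, with string keys as in the model card.
id2label = {
    '0': 'a_Good',
    '1': 'b_Moderate',
    '2': 'c_Unhealthy_for_Sensitive_Groups',
    '3': 'd_Unhealthy',
    '4': 'e_Very_Unhealthy',
    '5': 'f_Severe',
}

print(id2label[str(predicted_id)])
```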

Training hyperparameters

The following hyperparameters were used during training (a training-loop sketch follows this list):

  • learning_rate: 1e-4
  • train_batch_size: 16
  • eval_batch_size: 16
  • optimizer: Adam
  • num_epochs: 15
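
A minimal training-loop sketch under these settings, reusing the `CustomCNNSketch` class from the architecture section. Cross-entropy loss and the dummy tensor datasets are assumptions; in practice the loaders would wrap the actual imagefolder train/validation splits.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data; replace with the real train/validation image datasets.
train_loader = DataLoader(
    TensorDataset(torch.randn(32, 3, 224, 224), torch.randint(0, 6, (32,))),
    batch_size=16, shuffle=True,
)
val_loader = DataLoader(
    TensorDataset(torch.randn(16, 3, 224, 224), torch.randint(0, 6, (16,))),
    batch_size=16,
)

model = CustomCNNSketch()                  # architecture sketch from above
criterion = nn.CrossEntropyLoss()          # assumed loss; not stated in the card
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(15):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    # Report validation accuracy after each epoch.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            preds = model(images).argmax(dim=-1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    print(f"epoch {epoch + 1}: val_accuracy = {correct / total:.4f}")
```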

Training results

| Epoch | Training Loss | Training Accuracy | Validation Loss | Validation Accuracy |
|-------|---------------|-------------------|-----------------|---------------------|
| 1     | 1.6998        | 0.2595            | 1.5246          | 0.3568              |
| 2     | 1.4923        | 0.3758            | 1.4040          | 0.4375              |
| 3     | 1.3921        | 0.4374            | 1.2898          | 0.4911              |
| 4     | 1.2737        | 0.5020            | 1.1851          | 0.5232              |
| 5     | 1.1706        | 0.5424            | 1.1138          | 0.5738              |
| 6     | 1.0749        | 0.5842            | 1.0104          | 0.6182              |
| 7     | 0.9780        | 0.6256            | 0.9365          | 0.6452              |
| 8     | 0.8919        | 0.6637            | 0.8426          | 0.6998              |
| 9     | 0.8184        | 0.7034            | 0.8146          | 0.7029              |
| 10    | 0.7486        | 0.7286            | 0.7454          | 0.7494              |
| 11    | 0.6851        | 0.7560            | 0.6980          | 0.7560              |
| 12    | 0.6305        | 0.7759            | 0.6384          | 0.7744              |
| 13    | 0.5859        | 0.7933            | 0.5911          | 0.7922              |
| 14    | 0.5358        | 0.8141            | 0.5786          | 0.7963              |
| 15    | 0.4971        | 0.8270            | 0.5441          | 0.8142              |