# -*- coding: utf-8 -*-
"""model_training.ipynb
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/drive/1LgqvdLV1teCsAi6qjR_BBVt4TwX7vx9J
<a href="https://colab.research.google.com/github/gauravreddy08/food-vision/blob/main/model_training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
# **Food Vision**
As an introductory project for myself, I built an **end-to-end CNN Image Classification Model** which identifies the food in your image.
I started with a pretrained Image Classification Model that comes with Keras and then retrained it on the well-known **Food101 Dataset**.
**Fun Fact :**
The model actually beats the DeepFood paper's model, which was trained on the same dataset.
The accuracy of [**DeepFood**](https://arxiv.org/abs/1606.05675) was **77.4%** and our model's is **85%**. A difference of about **8%** isn't much, but the interesting part is that DeepFood's model took 2-3 days to train while ours took around 60 minutes.
> **Dataset :** `Food101`
> **Model :** `EfficientNetB1`
## **Setting up the Workspace**
* Checking the GPU
* Mounting Google Drive
* Importing TensorFlow
* Importing other required Packages
### **Checking the GPU**
For this project we will be working with **Mixed Precision**, which works best on a GPU with compute capability **7.0+**.
At the time of writing, Colab offers the following GPUs :
* Nvidia K80
* **Nvidia T4**
* Nvidia P100
Colab allocates a random GPU every time we factory reset the runtime, so you can reset the runtime until you get a **Tesla T4 GPU**, as the T4 has a compute capability of 7.5.
> If using local hardware, use a GPU with compute capability 7.0+ for best results.
Run the cell below to see which GPU is allocated to you.
"""
!nvidia-smi -L
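"""Optionally, the compute capability can also be checked from Python. This is my own sketch (not part of the original notebook); it assumes the TensorFlow build preinstalled in the Colab runtime can see the GPU."""
import tensorflow as tf

gpu_devices = tf.config.list_physical_devices('GPU')
if gpu_devices:
    # get_device_details returns a dict which may include 'compute_capability'
    details = tf.config.experimental.get_device_details(gpu_devices[0])
    print(details.get('device_name'), details.get('compute_capability'))
else:
    print("No GPU visible to TensorFlow")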
"""
### **Mounting Google Drive**
"""
from google.colab import drive
drive.mount('/content/drive')
"""### **Importing Tensorflow**
At the time of writing, `tesnorflow 2.5.0` has a bug with EfficientNet Models. [Click Here](https://github.com/tensorflow/tensorflow/issues/49725) to get more info about the bug. Hopefully tensorflow fixes it soon.
So the below code is used to downgrade the version to `tensorflow 2.4.1`, it will take a moment to uninstall the previous version and install our required version.
> You need to restart the **Runtime** after required version of tensorflow is installed.
**Note :** Restarting runtime won't assign you a new GPU.
"""
#!pip install tensorflow==2.4.1
import tensorflow as tf
print(tf.__version__)
"""### **Importing other required Packages**"""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime
import os
import tensorflow_datasets as tfds
import seaborn as sn
"""#### **Importing `helper_fuctions`**
The `helper_functions.py` is a python script created by me. Which has some important functions I use frequently while building Deep Learning Models.
"""
!wget https://raw.githubusercontent.com/sg-sparsh-goyal/extras/main/helper_function.py
from helper_function import plot_loss_curves, load_and_prep_image
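"""For reference, here is a minimal sketch of what `plot_loss_curves` might look like (the actual implementation lives in `helper_function.py` and may differ)."""
# Hypothetical equivalent of plot_loss_curves, for illustration only
def plot_loss_curves_sketch(history):
    # Plot training/validation loss and accuracy from a Keras History object
    loss = history.history["loss"]
    val_loss = history.history["val_loss"]
    accuracy = history.history["accuracy"]
    val_accuracy = history.history["val_accuracy"]
    epochs = range(len(loss))

    plt.plot(epochs, loss, label="training_loss")
    plt.plot(epochs, val_loss, label="val_loss")
    plt.title("Loss")
    plt.xlabel("Epochs")
    plt.legend()

    plt.figure()
    plt.plot(epochs, accuracy, label="training_accuracy")
    plt.plot(epochs, val_accuracy, label="val_accuracy")
    plt.title("Accuracy")
    plt.xlabel("Epochs")
    plt.legend()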
"""## **Getting the Data Ready**
The Dataset used is **Food101**, which is available on both Kaggle and Tensorflow.
In the below cells we will be importing Datasets from `Tensorflow Datasets` Module.
"""
# Prints the list of datasets available in the TensorFlow Datasets module
dataset_list = tfds.list_builders()
dataset_list[:10]
"""### **Importing Food101 Dataset**
**Disclaimer :**
The below cell will take time to run, as it will be downloading
**4.65GB data** from **Tensorflow Datasets Module**.
So do check if you have enough **Disk Space** and **Bandwidth Cap** to run the below cell.
"""
(train_data, test_data), ds_info = tfds.load(name='food101',
                                             split=['train', 'validation'],
                                             shuffle_files=False,
                                             as_supervised=True,
                                             with_info=True)
"""## **Becoming One with the Data**
One of the most important steps in building any ML or DL Model is to **become one with the data**.
Once you get the gist of what type of data your dealing with and how it is structured, everything else will fall in place.
"""
ds_info.features
class_names = ds_info.features['label'].names
class_names[:10]
train_one_sample = train_data.take(1)
train_one_sample
for image, label in train_one_sample:
    print(f"""
    Image Shape : {image.shape}
    Image Datatype : {image.dtype}
    Class : {class_names[label.numpy()]}
    """)
image[:2]
tf.reduce_min(image), tf.reduce_max(image)
plt.imshow(image)
plt.title(class_names[label.numpy()])
plt.axis(False);
"""## **Preprocessing the Data**
Since we've downloaded the data from TensorFlow Datasets, there are a couple of preprocessing steps we have to take before it's ready to model.
More specifically, our data is currently:
* In `uint8` data type
* Comprised of all differnet sized tensors (different sized images)
* Not scaled (the pixel values are between 0 & 255)
Whereas, models like data to be:
* In `float32` data type
* Have all of the same size tensors (batches require all tensors have the same shape, e.g. `(224, 224, 3)`)
* Scaled (values between 0 & 1), also called normalized
To take care of these, we'll create a `preprocess_img()` function which:
* Resizes an input image tensor to a specified size using [`tf.image.resize()`](https://www.tensorflow.org/api_docs/python/tf/image/resize)
* Converts an input image tensor's current datatype to `tf.float32` using [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast)
"""
def preprocess_img(image, label, img_size=224):
    """Resizes `image` to (img_size, img_size) and casts it to float32."""
    image = tf.image.resize(image, [img_size, img_size])
    # Cast to float32 (EfficientNet models rescale inputs internally, so no /255. here)
    image = tf.cast(image, tf.float32)
    return image, label
# Trying the preprocess function on a single image
preprocessed_img = preprocess_img(image, label)[0]
preprocessed_img
# Map the preprocessing function across the training data, then shuffle, batch and prefetch
train_data = train_data.map(preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
train_data = train_data.shuffle(buffer_size=1000).batch(32).prefetch(tf.data.AUTOTUNE)

# Map the preprocessing function across the test data, then batch
test_data = test_data.map(preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
test_data = test_data.batch(32)

train_data
test_data
"""## **Building the Model : EfficientNetB1**
### **Getting the Callbacks ready**
As we are dealing with a complex Neural Network (EfficientNetB0) its a good practice to have few call backs set up. Few callbacks I will be using throughtout this Notebook are :
* **TensorBoard Callback :** TensorBoard provides the visualization and tooling needed for machine learning experimentation
* **EarlyStoppingCallback :** Used to stop training when a monitored metric has stopped improving.
* **ReduceLROnPlateau :** Reduce learning rate when a metric has stopped improving.
We already have **TensorBoardCallBack** function setup in out helper function, all we have to do is get other callbacks ready.
"""
from helper_function import create_tensorboard_callback
# EarlyStopping callback: stops training and restores the best weights
# once val_accuracy hasn't improved for 3 epochs
early_stopping_callback = tf.keras.callbacks.EarlyStopping(restore_best_weights=True,
                                                           patience=3,
                                                           verbose=1,
                                                           monitor="val_accuracy")

# ReduceLROnPlateau callback: multiplies the learning rate by 0.2
# whenever val_accuracy fails to improve
lower_lr = tf.keras.callbacks.ReduceLROnPlateau(factor=0.2,
                                                monitor='val_accuracy',
                                                min_lr=1e-7,
                                                patience=0,
                                                verbose=1)
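"""For reference, a minimal sketch of what a TensorBoard callback factory like `create_tensorboard_callback` typically does (the real implementation lives in `helper_function.py` and may differ)."""
# A hypothetical equivalent of create_tensorboard_callback, for illustration only
def make_tensorboard_callback(dir_name, experiment_name):
    # Log to a unique, timestamped directory so runs don't overwrite each other
    log_dir = os.path.join(dir_name, experiment_name + datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
    return tf.keras.callbacks.TensorBoard(log_dir=log_dir)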
"""
### **Mixed Precision Training**
Mixed precision is used for training neural networks, reducing training time and memory requirements without affecting the model performance.
More Specifically, in **Mixed Precision** we will setting global dtype as `mixed_float16`. Because modern accelerators can run operations faster in the 16-bit dtypes, as they have specialized hardware to run 16-bit computations and 16-bit dtypes can be read from memory faster.
To know more about Mixed Precision, [**click here**](https://www.tensorflow.org/guide/mixed_precision)"""
from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy(policy='mixed_float16')
mixed_precision.global_policy()
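"""Under `mixed_float16`, computations run in `float16` while variables stay in `float32`. The quick check below (my own addition) inspects the active policy's dtypes."""
policy = mixed_precision.global_policy()
print(f"Compute dtype  : {policy.compute_dtype}")   # float16
print(f"Variable dtype : {policy.variable_dtype}")  # float32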
"""
### **Building the Model**"""
from tensorflow.keras import layers

# Create the base model
input_shape = (224, 224, 3)
base_model = tf.keras.applications.EfficientNetB1(include_top=False)

# Build a functional model on top of the base model
inputs = layers.Input(shape=input_shape, name="input_layer")
x = base_model(inputs)
x = layers.GlobalAveragePooling2D(name="pooling_layer")(x)
x = layers.Dropout(.3)(x)
x = layers.Dense(len(class_names))(x)
# Keep the output softmax in float32 for numerical stability under mixed precision
outputs = layers.Activation("softmax", dtype=tf.float32, name="softmax_float32")(x)
model = tf.keras.Model(inputs, outputs)

# Compiling the model
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(0.001),
              metrics=["accuracy"])
model.summary()
history = model.fit(train_data,
                    epochs=50,
                    steps_per_epoch=len(train_data),
                    validation_data=test_data,
                    validation_steps=int(0.15 * len(test_data)),
                    callbacks=[create_tensorboard_callback("training-logs", "EfficientNetB1-"),
                               early_stopping_callback,
                               lower_lr])
# Saving the model to Google Drive
model.save("/content/drive/My Drive/FinalModel.hdf5")

# Saving a local copy of the model
model.save("FoodVision.hdf5")
plot_loss_curves(history)
model.evaluate(test_data)
"""## **Evaluating our Model**"""
# Commented out IPython magic to ensure Python compatibility.
# %load_ext tensorboard
# %tensorboard --logdir training-logs
pred_probs = model.predict(test_data, verbose=1)
len(pred_probs), pred_probs.shape
pred_classes = pred_probs.argmax(axis=1)
pred_classes[:10], len(pred_classes), pred_classes.shape
# Getting the true labels for the test_data
y_labels = []
for images, labels in test_data.unbatch():
    y_labels.append(labels.numpy())
y_labels[:10]
# Predicted Labels vs. True Labels
pred_classes==y_labels
"""### **Sklearn's Accuracy Score**"""
from sklearn.metrics import accuracy_score
sklearn_acc = accuracy_score(y_labels, pred_classes)
sklearn_acc
"""### **Confusion Matrix**
A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known
"""
cm = tf.math.confusion_matrix(y_labels, pred_classes)
plt.figure(figsize=(100, 100));
sn.heatmap(cm, annot=True,
           fmt='',
           cmap='Purples');
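"""With 101 classes, the full heatmap is hard to read at a glance. The short helper below (my own addition, not in the original notebook) surfaces the most confused class pairs directly."""
# Find the class pairs the model confuses most often (off-diagonal maxima)
cm_np = cm.numpy().astype(float)
np.fill_diagonal(cm_np, 0)  # zero out correct predictions
top_pairs = np.argsort(cm_np, axis=None)[::-1][:5]
for flat_idx in top_pairs:
    true_i, pred_i = np.unravel_index(flat_idx, cm_np.shape)
    print(f"{class_names[true_i]} -> {class_names[pred_i]}: {int(cm_np[true_i, pred_i])} times")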
"""### **Model's Class-wise Accuracy Score**"""
from sklearn.metrics import classification_report
report = (classification_report(y_labels, pred_classes, output_dict=True))
# Create an empty dictionary
class_f1_scores = {}
# Loop through the classification report items
for k, v in report.items():
    if k == "accuracy":  # stop once we get to the accuracy key
        break
    else:
        # Append class names and f1-scores to the new dictionary
        class_f1_scores[class_names[int(k)]] = v["f1-score"]
class_f1_scores
report_df = pd.DataFrame(class_f1_scores, index=['f1-scores']).T
report_df = report_df.sort_values("f1-scores", ascending=True)

fig, ax = plt.subplots(figsize=(12, 25))
scores = ax.barh(range(len(report_df)), report_df["f1-scores"].values)
ax.set_yticks(range(len(report_df)))
plt.axvline(x=0.85, linestyle='--', color='r')
# Use the sorted index for the labels, so each bar gets the right class name
ax.set_yticklabels(report_df.index)
ax.set_xlabel("f1-score")
ax.set_title("F1-Scores for 101 Different Classes")
ax.invert_yaxis();  # reverse the order
"""### **Predicting on our own Custom images**
Once we have our model ready, its cruicial to evaluate it on our custom data : the data our model has never seen.
Training and evaluating a model on train and test data is cool, but making predictions on our own realtime images is another level.
"""
import os
directory_path = "/content/drive/MyDrive/FoodVisionModels/Custom Images"
os.makedirs(directory_path, exist_ok=True)
# Use os.path.join so the directory and filename are joined with a separator
custom_food_images = [os.path.join(directory_path, img_path) for img_path in os.listdir(directory_path)]
custom_food_images
import os
import matplotlib.pyplot as plt

def pred_plot_custom(folder_path):
    # Collect the image file paths inside folder_path
    custom_food_images = [os.path.join(folder_path, img_path)
                          for img_path in os.listdir(folder_path)
                          if os.path.isfile(os.path.join(folder_path, img_path))]
    for img in custom_food_images:
        img = load_and_prep_image(img, scale=False)
        pred_prob = model.predict(tf.expand_dims(img, axis=0))
        pred_class = class_names[pred_prob.argmax()]
        # Indices of the model's top 5 predictions, highest probability first
        top_5_i = (pred_prob.argsort())[0][-5:][::-1]
        values = pred_prob[0][top_5_i]
        labels = [class_names[i] for i in top_5_i]
        fig, ax = plt.subplots(1, 2, figsize=(15, 5))
        # Plotting the image
        ax[0].imshow(img / 255.)
        ax[0].set_title(f"Prediction: {pred_class} Probability: {pred_prob.max():.2f}")
        ax[0].axis('off')
        # Plotting the model's top 5 predictions
        ax[1].bar(labels, values, color='orange')
        ax[1].set_title('Top 5 Predictions')
        plt.show()
pred_plot_custom("/content/drive/MyDrive/FoodVisionModels/Custom Images/")