Model Training Notes

Validation Accuracy: `train0`

Note: "v1" = IMAGENET1K_V1, "v2" = IMAGENET1K_V2
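
As a rough illustration of the shorthand above, the tags can be mapped onto torchvision weight names as sketched below. The helper names `resolve_weights` and `load_backbone` are hypothetical and not from the training code, and the sketch assumes torchvision 0.14 or newer, where `get_model` accepts a weights name as a string.

```python
# Shorthand used in these notes -> torchvision weights name.
WEIGHT_TAGS = {"v1": "IMAGENET1K_V1", "v2": "IMAGENET1K_V2"}


def resolve_weights(tag: str) -> str:
    """Translate the shorthand ("v1"/"v2") into a torchvision weights name."""
    try:
        return WEIGHT_TAGS[tag.lower()]
    except KeyError:
        raise ValueError(f"unknown weights tag: {tag!r}") from None


def load_backbone(arch: str, tag: str):
    """Build a pretrained backbone, e.g. load_backbone("resnet50", "v2").

    torchvision is imported lazily so the mapping above works without it.
    """
    from torchvision.models import get_model  # assumes torchvision >= 0.14

    return get_model(arch, weights=resolve_weights(tag))
```

Keeping the tag-to-name mapping in one place means a table row labeled "v2" always resolves to the same initialization.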

Uses the newly labeled Stanford Cars test set as additional training data.

Hyperparameters changed: optimizer and learning rate

More data improves accuracy: All models saw substantial gains in `train1` compared to `train0`.
Deeper models help: ResNet101 generally outperforms ResNet50.
Optimizer matters: Adam (lr=1e-4) yielded the highest accuracy; lower and higher learning rates, as well as SGD, performed worse.
ImageNet v1 vs. v2: The difference between v1 and v2 initializations is minor compared to the effect of data volume and model size.
Performance margins: The right optimizer and learning rate can more than double validation accuracy for the same architecture.
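
The comparisons above could be organized as a small sweep, sketched below under stated assumptions. `RunConfig`, `build_grid`, `best_run`, and `accuracy_ratio` are hypothetical names, and the learning rates other than 1e-4 are placeholders, since the notes do not list the exact values tried. No results are included; `results` is assumed to be a dict mapping each configuration to its validation accuracy.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunConfig:
    arch: str       # e.g. "resnet50" or "resnet101"
    weights: str    # "v1" or "v2" initialization
    optimizer: str  # "adam" or "sgd"
    lr: float


def build_grid(archs, weights, optimizers, lrs):
    """Enumerate every combination of the swept settings."""
    return [RunConfig(a, w, o, lr)
            for a, w, o, lr in product(archs, weights, optimizers, lrs)]


def best_run(results):
    """Return the (config, accuracy) pair with the highest validation accuracy."""
    return max(results.items(), key=lambda item: item[1])


def accuracy_ratio(results, arch, weights):
    """Best-to-worst accuracy ratio for one architecture and initialization."""
    accs = [acc for cfg, acc in results.items()
            if cfg.arch == arch and cfg.weights == weights]
    if not accs or min(accs) <= 0:
        raise ValueError("need at least one run with positive accuracy")
    return max(accs) / min(accs)


# Hypothetical sweep values; only Adam with lr=1e-4 is named in the notes.
GRID = build_grid(
    archs=["resnet50", "resnet101"],
    weights=["v1", "v2"],
    optimizers=["adam", "sgd"],
    lrs=[1e-5, 1e-4, 1e-3],
)
```

A ratio above 2 from `accuracy_ratio` corresponds to the "more than double" observation for a fixed architecture.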