atiwari751 committed
Commit 8c2dd79 · 1 Parent(s): 55d34a0

Final model with README

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -97,7 +97,7 @@ The test accuracy here is measured on the validation dataset of ImageNet 1k, as
 
 We used checkpoints throughout the training, saving the best performing model from every experiment. The next experiment would start with this checkpoint. The final model is hence a culmination of all the experiments.
 
- Two key checkpoints were at 20 and 30 epochs, and the effects of the changes can be seen distinctly in the model graphs. After 20 epochs we noticed that the model was underfitting, whereas when we had run the model without any augmentations (not shown in these graphs and logs), it was already overfitting after 5 epochs. The model was unable to reduce the underfit: the delta (train minus test accuracy) was not decreasing monotonically but oscillating. We concluded that the augmentation was too strong, and hence reduced the jitter augmentation hyperparameters (brightness, contrast, saturation and hue) after 20 epochs. This had a favorable impact on model performance: there was a sharp jump in training and test accuracies at this point, followed by a steadily decreasing delta. Following this, at 30 epochs, we reduced the jitter augmentation further (the probability hyperparameter) to hasten the convergence of the model to the target, and also added the One Cycle LR scheduler. At 38 epochs, we hit the target of 70% top-1 test accuracy.
+ Two key checkpoints were at 20 and 30 epochs, and the effects of the changes can be seen distinctly in the model graphs. After 20 epochs we noticed that the model was underfitting, whereas when we had run the model without any augmentations (not shown in these graphs and logs), it was already overfitting after 5 epochs. The model was unable to reduce the underfit: the delta (train minus test accuracy) was not decreasing monotonically but oscillating. We concluded that the augmentation was too strong, and hence reduced the jitter augmentation hyperparameters (brightness, contrast, saturation and hue) after 20 epochs. We also added the One Cycle LR scheduler at this point. Both changes had a favorable impact on model performance: there was a sharp jump in training and test accuracies (likely due to the smaller learning rate applied at the start of One Cycle LR), followed by a steadily decreasing delta (likely due to the reduced jitter). Following this, at 30 epochs, we reduced the jitter augmentation further (the probability hyperparameter). At 38 epochs, we hit the target of 70% top-1 test accuracy.
 
  ## Visualizations
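
For readers following along, here is a minimal sketch of the checkpointing scheme the paragraph above describes, assuming a standard PyTorch training loop (the README does not state the framework outright). The file name, the tracked metric, and the helper names are illustrative, not taken from the repository.

```python
import torch

CKPT_PATH = "checkpoint.pth"  # hypothetical file name

def save_if_best(model, optimizer, epoch, test_acc, best_acc):
    """Save a checkpoint only when this epoch beats the best test accuracy so far."""
    if test_acc > best_acc:
        torch.save(
            {
                "epoch": epoch,
                "model_state": model.state_dict(),
                "optimizer_state": optimizer.state_dict(),
                "test_acc": test_acc,
            },
            CKPT_PATH,
        )
        return test_acc
    return best_acc

def resume(model, optimizer):
    """Start the next experiment from the best checkpoint of the previous one."""
    ckpt = torch.load(CKPT_PATH, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"], ckpt["test_acc"]
```

Restoring the optimizer state alongside the weights is what lets each experiment pick up where the previous one left off.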
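Likewise, a sketch of the two changes introduced at the 20-epoch checkpoint: weaker color jitter and the One Cycle LR scheduler. This assumes torchvision transforms; every hyperparameter value shown is a placeholder, since the diff does not give the actual values.

```python
import torch
from torchvision import transforms

# Weaker color jitter after 20 epochs: smaller ranges for brightness,
# contrast, saturation and hue. All values here are placeholders.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),  # standard ImageNet crop (assumed)
    transforms.RandomApply(
        [transforms.ColorJitter(brightness=0.2, contrast=0.2,
                                saturation=0.2, hue=0.05)],
        p=0.5,  # this probability is what was lowered again at 30 epochs
    ),
    transforms.ToTensor(),
])

model = torch.nn.Linear(10, 10)  # stand-in for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One Cycle LR, added at the same checkpoint. It warms up from
# max_lr / div_factor, so training resumes at a small learning rate
# right after the checkpoint, then anneals back down.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=0.1,            # placeholder
    epochs=18,             # roughly the 20 -> 38 epoch stretch
    steps_per_epoch=1000,  # placeholder: len(train_loader) in practice
)
# scheduler.step() is then called once per batch in the training loop.
```

OneCycleLR starting at max_lr / div_factor is consistent with the observation in the diff: training resumes from the checkpoint at a much smaller learning rate, which lines up with the sharp jump in accuracies noted at that point.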