atiwari751/ResNet50_replicate

Ubuntu committed on Jan 3

Commit 2959565

1 Parent(s): d759493

Added augmentations as per the original paper

Files changed (2)

README.md CHANGED

@@ -6,6 +6,32 @@
 ## Data Augmentations
 ## Model Results

 ## Data Augmentations
+To improve robustness and generalization, training images pass through a fixed augmentation pipeline, implemented with the albumentations library and inspired by the original ResNet paper: a random resized crop, a horizontal flip, and color jitter, followed by normalization. Seeing each image under varying crops, orientations, and color conditions encourages the model to learn features that are invariant to those changes, which should help on unseen data.
+### Augmentations and Hyperparameters
+1. **Random Resized Crop:**
+   - Height: 224
+   - Width: 224
+   - Scale: (0.08, 1.0)
+   - Aspect Ratio: (3/4, 4/3)
+   - Probability: 1.0
+2. **Horizontal Flip:**
+   - Probability: 0.5
+3. **Color Jitter:**
+   - Brightness: 0.4
+   - Contrast: 0.4
+   - Saturation: 0.4
+   - Hue: 0.1
+   - Probability: 0.8
+4. **Normalization (ImageNet statistics):**
+   - Mean: (0.485, 0.456, 0.406)
+   - Standard Deviation: (0.229, 0.224, 0.225)
+These augmentations are applied only to the training dataset. The test dataset is only resized, center-cropped to 224x224, and normalized, so evaluation is deterministic and comparable across runs.
 ## Model Results
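
The `Scale` and `Aspect Ratio` ranges listed for the random resized crop in the README hunk above control how each crop box is drawn. As a rough illustration only, the sketch below follows the commonly used sampling scheme: an area fraction drawn uniformly from `scale`, an aspect ratio drawn log-uniformly from `ratio`, and a centered fallback when no sampled box fits. `sample_crop_box` and its `attempts` parameter are hypothetical names, and albumentations' own implementation may differ in detail. The selected box would then be resized to 224x224.

```python
import math
import random


def sample_crop_box(img_h, img_w, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                    attempts=10, rng=random):
    """Return (top, left, height, width) of a random crop inside the image."""
    area = img_h * img_w
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(attempts):
        # Fraction of the image area the crop should cover
        target_area = area * rng.uniform(scale[0], scale[1])
        # Aspect ratio (width / height), sampled uniformly in log space
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= img_w and 0 < h <= img_h:
            top = rng.randint(0, img_h - h)
            left = rng.randint(0, img_w - w)
            return top, left, h, w
    # Fallback: largest centered crop whose aspect ratio lies within bounds
    in_ratio = img_w / img_h
    if in_ratio < ratio[0]:
        w = img_w
        h = int(round(w / ratio[0]))
    elif in_ratio > ratio[1]:
        h = img_h
        w = int(round(h * ratio[1]))
    else:
        h, w = img_h, img_w
    return (img_h - h) // 2, (img_w - w) // 2, h, w


# Example: one crop box for a 500x375 (width x height) image
top, left, h, w = sample_crop_box(375, 500, rng=random.Random(0))
```

With the lower `scale` bound of 0.08, a crop may cover as little as about 8% of the image, which is what makes this augmentation considerably more aggressive than a fixed center crop.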

resnet_execute.py CHANGED

@@ -10,20 +10,31 @@ from torchvision import datasets
 from checkpoint import save_checkpoint, load_checkpoint
 import matplotlib.pyplot as plt
 from torchvision.utils import make_grid
 # Define transformations
-transform = transforms.Compose([
-    transforms.Resize(256),  # Resize the smaller side to 256 pixels while keeping aspect ratio
-    transforms.CenterCrop(224),  # Then crop to 224x224 pixels from the center
-    transforms.ToTensor(),
-    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # ImageNet normalization
 ])
 # Train dataset and loader
-trainset = datasets.ImageFolder(root='/mnt/imagenet/ILSVRC/Data/CLS-LOC/train', transform=transform)
 trainloader = DataLoader(trainset, batch_size=128, shuffle=True, num_workers=16, pin_memory=True)
-testset = datasets.ImageFolder(root='/mnt/imagenet/ILSVRC/Data/CLS-LOC/val', transform=transform )
 testloader = DataLoader(testset, batch_size=1000, shuffle=False, num_workers=16, pin_memory=True)
 # Initialize model, loss function, and optimizer

 from checkpoint import save_checkpoint, load_checkpoint
 import matplotlib.pyplot as plt
 from torchvision.utils import make_grid
+import albumentations as A
+from albumentations.pytorch import ToTensorV2
+import numpy as np
 # Define transformations
+train_transform = A.Compose([
+    A.RandomResizedCrop(height=224, width=224, scale=(0.08, 1.0), ratio=(3/4, 4/3), p=1.0),
+    A.HorizontalFlip(p=0.5),
+    A.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1, p=0.8),
+    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
+    ToTensorV2()
+])
+test_transform = A.Compose([
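+    A.Resize(height=256, width=256),  # Resizes to exactly 256x256 without preserving aspect ratio (the removed transforms.Resize(256) scaled only the shorter side)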
+    A.CenterCrop(height=224, width=224),
+    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
+    ToTensorV2()
 ])
 # Train dataset and loader
+trainset = datasets.ImageFolder(root='/mnt/imagenet/ILSVRC/Data/CLS-LOC/train', transform=lambda img: train_transform(image=np.array(img))['image'])  # albumentations expects a NumPy array via image= and returns a dict; 'image' holds the result
 trainloader = DataLoader(trainset, batch_size=128, shuffle=True, num_workers=16, pin_memory=True)
+testset = datasets.ImageFolder(root='/mnt/imagenet/ILSVRC/Data/CLS-LOC/val', transform=lambda img: test_transform(image=np.array(img))['image'])
 testloader = DataLoader(testset, batch_size=1000, shuffle=False, num_workers=16, pin_memory=True)
 # Initialize model, loss function, and optimizer
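
The `lambda` wrappers passed as `transform=` above work when DataLoader workers are forked, which is the usual default on Linux. Lambdas cannot be pickled, though, so the same code fails with `num_workers > 0` wherever workers are started with the spawn method (for example on Windows or macOS). One possible alternative is a small callable class defined at module level. This is a sketch under that assumption: `AlbumentationsAdapter`, its `to_array` parameter, and `fake_pipeline` are hypothetical names, not part of this commit or of albumentations.

```python
class AlbumentationsAdapter:
    """Picklable stand-in for `lambda img: t(image=np.array(img))['image']`."""

    def __init__(self, transform, to_array):
        self.transform = transform  # e.g. an A.Compose pipeline
        self.to_array = to_array    # e.g. np.array, converts a PIL image to an array

    def __call__(self, img):
        # albumentations-style call: keyword argument in, dict out
        return self.transform(image=self.to_array(img))["image"]


# Stand-in pipeline so the sketch runs without albumentations installed;
# it mimics the call convention only and performs no augmentation.
def fake_pipeline(image):
    return {"image": [2 * v for v in image]}


adapter = AlbumentationsAdapter(fake_pipeline, to_array=list)
result = adapter((1, 2, 3))
```

In `resnet_execute.py` this would be used as `transform=AlbumentationsAdapter(train_transform, np.array)` in place of the lambda, provided the wrapped pipeline is itself picklable.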