Commit 55d34a0 (parent: ea78c09): Final submission with README

Files changed: Final_model_log.png (added), Model_comparison.png (added), README.md (+356, -6)
# ImageNet 1k Image Classification with ResNet 50

This project is an implementation of the ResNet 50 model, written and trained from scratch on the ImageNet 1k dataset. The model was trained on an AWS EC2 instance (g6.4xlarge). The target for the project was for the model to reach at least 70% top-1 test accuracy.
## Model Architecture

The ResNet 50 model in this project is constructed from scratch using PyTorch. The architecture is based on the original ResNet paper, featuring a series of bottleneck blocks with skip connections that allow deep networks to train without the vanishing gradient problem.

### Key Components

- **Bottleneck Block**: Each block consists of three convolutional layers. The first layer reduces the dimensionality, the second performs the main computation, and the third expands the dimensionality back. This design lets the network learn complex features while maintaining computational efficiency.

- **Initial Convolution Layer**: The model begins with a 7x7 convolutional layer with a stride of 2, followed by batch normalization and a ReLU activation. A 3x3 max pooling layer then reduces the spatial dimensions.

- **Layer Stacking**: The network is composed of four main layers, each containing a series of bottleneck blocks with skip connections. The number of blocks per layer is [3, 4, 6, 3]. The first layer maintains the input dimensions, while each subsequent layer downsamples the spatial dimensions using a stride of 2 in its first block.

- **Adaptive Average Pooling**: After the final layer, an adaptive average pooling layer reduces the feature map to a 1x1 spatial dimension, preparing it for the fully connected layer.

- **Fully Connected Layer**: The final layer is a fully connected layer that outputs the class scores (logits) for the 1000 classes of the ImageNet 1k dataset.

This implementation leverages PyTorch's `nn.Module` to define the model structure, ensuring flexibility and ease of use. The model is designed to be trained on large-scale datasets like ImageNet, with the capability to achieve high accuracy through its deep architecture and efficient bottleneck design.
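The bottleneck block described above can be sketched as follows. This is a minimal illustration, not this repository's code: the class name, the expansion factor of 4 (standard for ResNet 50), and the 1x1 projection shortcut follow the original ResNet paper.

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 compute -> 1x1 expand, with a skip connection."""
    expansion = 4  # output channels = mid_channels * expansion

    def __init__(self, in_channels, mid_channels, stride=1):
        super().__init__()
        out_channels = mid_channels * self.expansion
        # 1x1 conv reduces dimensionality
        self.conv1 = nn.Conv2d(in_channels, mid_channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        # 3x3 conv performs the main computation (stride 2 downsamples)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, 3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        # 1x1 conv expands dimensionality back
        self.conv3 = nn.Conv2d(mid_channels, out_channels, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Projection shortcut when the shape changes, identity otherwise
        self.downsample = None
        if stride != 1 or in_channels != out_channels:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)
```

Stacking such blocks in groups of [3, 4, 6, 3], with stride 2 in the first block of each group after the first, yields the 50-layer network.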
## Data Augmentations

The augmentations are inspired by the original ResNet paper and implemented using the albumentations library. They include random resized cropping, horizontal flipping, and color jittering, followed by normalization. These transformations help the model learn invariant features and improve performance on unseen data.

### Augmentations and Hyperparameters

1. **Random Resized Crop**

2. **Horizontal Flip:**
   - Probability: 0.5

3. **Color Jitter:**
   - Brightness: 0.2
   - Contrast: 0.2
   - Saturation: 0.2
   - Hue: 0.05
   - Probability: 0.5

4. **Normalization:**
   - Mean: (0.485, 0.456, 0.406)

These augmentations are applied only to the training dataset, while the test dataset undergoes only resizing and normalization to ensure consistent evaluation metrics.
## Model Hyperparameters

Training the ResNet 50 model involves several key hyperparameters that are crucial for optimizing performance and ensuring efficient learning. These hyperparameters are defined and used across the project's scripts.

### Loss Function

- **Cross-Entropy Loss**: The model uses the cross-entropy loss function, which is suited to multi-class classification problems like ImageNet. It measures model performance by comparing the predicted class probabilities with the true class labels.

### Optimizer

- **Stochastic Gradient Descent (SGD)**: The optimizer is SGD, a popular choice for training deep learning models, configured with:
  - Learning Rate (`lr`): 0.001
  - Momentum: 0.9
  - Weight Decay: 0.0001

These parameters control the update steps during training; momentum accelerates the optimizer in the relevant direction and dampens oscillations.

### Learning Rate Scheduler

- **One-Cycle Learning Rate Scheduler**: The One-Cycle LR scheduler adjusts the learning rate dynamically during training, improving convergence by first increasing the learning rate and then gradually decreasing it. The maximum learning rate (`max_lr`) is set to 0.01, and the scheduler runs for the entire training duration.

These hyperparameters are defined in the `main.py` script, where the model, optimizer, and scheduler are initialized and used in the training loop.
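Wiring these pieces together looks roughly like the sketch below. The placeholder model and the epoch count are hypothetical stand-ins, not the values from `main.py`; only the loss, optimizer, and scheduler settings come from this README.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model and training length in main.py:
model = nn.Linear(10, 1000)          # placeholder for the ResNet 50
epochs, steps_per_epoch = 10, 10010  # 10010 batches per epoch, per the logs

# Cross-entropy loss for 1000-way classification.
criterion = nn.CrossEntropyLoss()
loss = criterion(model(torch.randn(4, 10)), torch.randint(0, 1000, (4,)))

# SGD with the hyperparameters listed above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.001,
                            momentum=0.9, weight_decay=0.0001)

# One-Cycle LR: ramps the learning rate up toward max_lr, then anneals it
# back down over the whole run; scheduler.step() is called once per batch,
# after optimizer.step().
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.01, epochs=epochs, steps_per_epoch=steps_per_epoch)
```

Note that `OneCycleLR` manages the learning rate itself from construction onward, so the optimizer's initial `lr` mainly matters before the scheduler is attached.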
## Model Results

After 39 epochs of training, the model achieved the following accuracies:

**Top-1 Accuracy**

- Train accuracy: 67.30%
- Test accuracy: 70.27%

**Top-5 Accuracy**

- Train accuracy: 86.46%
- Test accuracy: 89.94%

The test accuracy here is measured on the validation dataset of ImageNet 1k, as the labels for the actual ImageNet 1k test dataset are not publicly available.
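The top-1 and top-5 numbers above can be computed with a small helper like the one below; `topk_accuracy` is an illustrative name, not necessarily the function used in this project.

```python
import torch

def topk_accuracy(logits, targets, ks=(1, 5)):
    """Percent of samples whose true label is among the top-k predictions."""
    _, pred = logits.topk(max(ks), dim=1)      # (N, max_k) predicted classes
    correct = pred.eq(targets.unsqueeze(1))    # (N, max_k) boolean hits
    return [correct[:, :k].any(dim=1).float().mean().item() * 100 for k in ks]
```

During evaluation this is accumulated over all validation batches and averaged, which yields the Top-1/Top-5 figures reported in the logs.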
### Experimentation

We used checkpoints throughout training, saving the best-performing model from every experiment; each subsequent experiment started from this checkpoint. The final model is hence a culmination of all the experiments.

Two key checkpoints were at 20 epochs and at 30 epochs, and the effects of the changes can be seen distinctly in the model graphs. After 20 epochs we noticed the model was underfitting, whereas a run without any augmentations (not shown in these graphs and logs) had already been overfitting after 5 epochs. The model was unable to reduce the underfit: the delta (train minus test accuracy) was oscillating rather than decreasing monotonically. We concluded the augmentation was too strong, and therefore reduced the color jitter hyperparameters (brightness, contrast, saturation, and hue) after 20 epochs. This had a favorable impact on performance: there was a sharp jump in training and test accuracies at that point, followed by a steadily decreasing delta. At 30 epochs, we reduced the jitter further (the probability hyperparameter) to speed up convergence toward the target, and added the One-Cycle LR scheduler. At 38 epochs, the model hit the target of 70% top-1 test accuracy.
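The checkpoint-and-resume mechanism that made these staged experiments possible can be sketched as below; the file name and the exact contents of the checkpoint dict are assumptions, not this repository's code.

```python
import torch

def save_checkpoint(model, optimizer, epoch, best_acc, path="checkpoint.pth"):
    # Persist everything needed to resume the next experiment from here.
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
        "best_acc": best_acc,
    }, path)
    print(f"Checkpoint saved at epoch {epoch}")

def load_checkpoint(model, optimizer, path="checkpoint.pth"):
    # Restore weights and optimizer state, then resume at the next epoch.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    print(f"Checkpoint loaded, resuming from epoch {ckpt['epoch'] + 1}")
    return ckpt["epoch"] + 1, ckpt["best_acc"]
```

Because augmentation strength and the scheduler were changed between experiments, only the model and optimizer state carry over; the new settings take effect from the resumed epoch.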
## Visualizations

![Final model log](Final_model_log.png)

Final model log - note the changes in the curves at the epochs where the training setup was changed.

![Model comparison](Model_comparison.png)

Delta (train - test) accuracy log - note that the model was unable to reduce underfitting until 20 epochs, and how reducing the augmentation after that point helped the model converge.
## Model Logs

```text
Checkpoint loaded, resuming from epoch 1

Epoch 1 | Loss: 4.1308 | Top-1 Acc: 18.95 | Top-5 Acc: 40.25: 100%|██████████| 10010/10010 [55:03<00:00, 3.03it/s]
Test Loss: 3.4872, Top-1 Accuracy: 26.32, Top-5 Accuracy: 51.94
Epoch 1 | Train Top-1 Acc: 18.95 | Test Top-1 Acc: 26.32
Checkpoint saved at epoch 1

Epoch 2 | Loss: 3.4660 | Top-1 Acc: 28.50 | Top-5 Acc: 52.89: 100%|██████████| 10010/10010 [54:59<00:00, 3.03it/s]
Test Loss: 3.0054, Top-1 Accuracy: 34.55, Top-5 Accuracy: 61.18
Epoch 2 | Train Top-1 Acc: 28.50 | Test Top-1 Acc: 34.55
Checkpoint saved at epoch 2

Epoch 3 | Loss: 3.1044 | Top-1 Acc: 34.40 | Top-5 Acc: 59.59: 100%|██████████| 10010/10010 [55:06<00:00, 3.03it/s]
Test Loss: 2.6458, Top-1 Accuracy: 40.45, Top-5 Accuracy: 67.49
Epoch 3 | Train Top-1 Acc: 34.40 | Test Top-1 Acc: 40.45
Checkpoint saved at epoch 3

Epoch 4 | Loss: 2.8763 | Top-1 Acc: 38.37 | Top-5 Acc: 63.71: 100%|██████████| 10010/10010 [54:59<00:00, 3.03it/s]
Test Loss: 2.4953, Top-1 Accuracy: 43.46, Top-5 Accuracy: 70.21
Epoch 4 | Train Top-1 Acc: 38.37 | Test Top-1 Acc: 43.46
Checkpoint saved at epoch 4

Epoch 5 | Loss: 2.7141 | Top-1 Acc: 41.27 | Top-5 Acc: 66.46: 100%|██████████| 10010/10010 [55:07<00:00, 3.03it/s]
Test Loss: 2.3763, Top-1 Accuracy: 45.35, Top-5 Accuracy: 72.20
Epoch 5 | Train Top-1 Acc: 41.27 | Test Top-1 Acc: 45.35
Checkpoint saved at epoch 5

Epoch 6 | Loss: 2.5956 | Top-1 Acc: 43.44 | Top-5 Acc: 68.52: 100%|██████████| 10010/10010 [55:05<00:00, 3.03it/s]
Test Loss: 2.2087, Top-1 Accuracy: 48.92, Top-5 Accuracy: 74.94
Epoch 6 | Train Top-1 Acc: 43.44 | Test Top-1 Acc: 48.92
Checkpoint saved at epoch 6

Epoch 7 | Loss: 2.5062 | Top-1 Acc: 45.17 | Top-5 Acc: 70.01: 100%|██████████| 10010/10010 [55:00<00:00, 3.03it/s]
Test Loss: 2.1293, Top-1 Accuracy: 50.39, Top-5 Accuracy: 76.31
Epoch 7 | Train Top-1 Acc: 45.17 | Test Top-1 Acc: 50.39
Checkpoint saved at epoch 7

Epoch 8 | Loss: 2.4347 | Top-1 Acc: 46.44 | Top-5 Acc: 71.23: 100%|██████████| 10010/10010 [55:08<00:00, 3.03it/s]
Test Loss: 2.0405, Top-1 Accuracy: 51.67, Top-5 Accuracy: 77.95
Epoch 8 | Train Top-1 Acc: 46.44 | Test Top-1 Acc: 51.67
Checkpoint saved at epoch 8

Epoch 9 | Loss: 2.3718 | Top-1 Acc: 47.69 | Top-5 Acc: 72.29: 100%|██████████| 10010/10010 [55:03<00:00, 3.03it/s]
Test Loss: 1.9893, Top-1 Accuracy: 52.78, Top-5 Accuracy: 78.42
Epoch 9 | Train Top-1 Acc: 47.69 | Test Top-1 Acc: 52.78
Checkpoint saved at epoch 9

Epoch 10 | Loss: 2.3219 | Top-1 Acc: 48.60 | Top-5 Acc: 73.08: 100%|██████████| 10010/10010 [55:15<00:00, 3.02it/s]
Test Loss: 2.0084, Top-1 Accuracy: 52.50, Top-5 Accuracy: 78.19
Epoch 10 | Train Top-1 Acc: 48.60 | Test Top-1 Acc: 52.50
Checkpoint saved at epoch 10

Epoch 11 | Loss: 2.2819 | Top-1 Acc: 49.38 | Top-5 Acc: 73.73: 100%|██████████| 10010/10010 [55:10<00:00, 3.02it/s]
Test Loss: 1.9478, Top-1 Accuracy: 54.22, Top-5 Accuracy: 79.29
Epoch 11 | Train Top-1 Acc: 49.38 | Test Top-1 Acc: 54.22
Checkpoint saved at epoch 11

Epoch 12 | Loss: 2.2439 | Top-1 Acc: 50.13 | Top-5 Acc: 74.37: 100%|██████████| 10010/10010 [55:07<00:00, 3.03it/s]
Test Loss: 1.8487, Top-1 Accuracy: 55.76, Top-5 Accuracy: 80.88
Epoch 12 | Train Top-1 Acc: 50.13 | Test Top-1 Acc: 55.76
Checkpoint saved at epoch 12

Epoch 13 | Loss: 2.2105 | Top-1 Acc: 50.81 | Top-5 Acc: 74.89: 100%|██████████| 10010/10010 [55:04<00:00, 3.03it/s]
Test Loss: 1.8293, Top-1 Accuracy: 55.94, Top-5 Accuracy: 81.07
Epoch 13 | Train Top-1 Acc: 50.81 | Test Top-1 Acc: 55.94
Checkpoint saved at epoch 13

Epoch 14 | Loss: 2.1846 | Top-1 Acc: 51.25 | Top-5 Acc: 75.27: 100%|██████████| 10010/10010 [55:04<00:00, 3.03it/s]
Test Loss: 1.8419, Top-1 Accuracy: 56.05, Top-5 Accuracy: 81.10
Epoch 14 | Train Top-1 Acc: 51.25 | Test Top-1 Acc: 56.05
Checkpoint saved at epoch 14

Epoch 15 | Loss: 2.1587 | Top-1 Acc: 51.81 | Top-5 Acc: 75.66: 100%|██████████| 10010/10010 [55:15<00:00, 3.02it/s]
Test Loss: 1.8308, Top-1 Accuracy: 56.21, Top-5 Accuracy: 81.08
Epoch 15 | Train Top-1 Acc: 51.81 | Test Top-1 Acc: 56.21
Checkpoint saved at epoch 15

Epoch 16 | Loss: 2.1365 | Top-1 Acc: 52.22 | Top-5 Acc: 75.97: 100%|██████████| 10010/10010 [55:05<00:00, 3.03it/s]
Test Loss: 1.7530, Top-1 Accuracy: 57.90, Top-5 Accuracy: 82.15
Epoch 16 | Train Top-1 Acc: 52.22 | Test Top-1 Acc: 57.90
Checkpoint saved at epoch 16

Epoch 17 | Loss: 2.1152 | Top-1 Acc: 52.67 | Top-5 Acc: 76.34: 100%|██████████| 10010/10010 [55:46<00:00, 2.99it/s]
Test Loss: 1.7318, Top-1 Accuracy: 58.22, Top-5 Accuracy: 82.60
Epoch 17 | Train Top-1 Acc: 52.67 | Test Top-1 Acc: 58.22
Checkpoint saved at epoch 17

Epoch 18 | Loss: 2.0959 | Top-1 Acc: 53.04 | Top-5 Acc: 76.69: 100%|██████████| 10010/10010 [55:10<00:00, 3.02it/s]
Test Loss: 1.7744, Top-1 Accuracy: 57.56, Top-5 Accuracy: 82.22
Epoch 18 | Train Top-1 Acc: 53.04 | Test Top-1 Acc: 57.56
Checkpoint saved at epoch 18

Epoch 19 | Loss: 2.0762 | Top-1 Acc: 53.38 | Top-5 Acc: 76.97: 100%|██████████| 10010/10010 [55:12<00:00, 3.02it/s]
Test Loss: 1.7218, Top-1 Accuracy: 58.68, Top-5 Accuracy: 82.62
Epoch 19 | Train Top-1 Acc: 53.38 | Test Top-1 Acc: 58.68
Checkpoint saved at epoch 19

Epoch 20 | Loss: 2.0584 | Top-1 Acc: 53.74 | Top-5 Acc: 77.23: 100%|██████████| 10010/10010 [55:27<00:00, 3.01it/s]
Test Loss: 1.6975, Top-1 Accuracy: 59.45, Top-5 Accuracy: 83.41
Epoch 20 | Train Top-1 Acc: 53.74 | Test Top-1 Acc: 59.45
Checkpoint saved at epoch 20

Checkpoint loaded, resuming from epoch 21

Epoch 21 | Loss: 1.6843 | Top-1 Acc: 61.39 | Top-5 Acc: 82.66: 100%|██████████| 10010/10010 [55:13<00:00, 3.02it/s]
Test Loss: 1.3510, Top-1 Accuracy: 66.61, Top-5 Accuracy: 88.13
Epoch 21 | Train Top-1 Acc: 61.39 | Test Top-1 Acc: 66.61
Checkpoint saved at epoch 21

Epoch 22 | Loss: 1.6090 | Top-1 Acc: 62.97 | Top-5 Acc: 83.74: 100%|██████████| 10010/10010 [55:25<00:00, 3.01it/s]
Test Loss: 1.3132, Top-1 Accuracy: 67.40, Top-5 Accuracy: 88.50
Epoch 22 | Train Top-1 Acc: 62.97 | Test Top-1 Acc: 67.40
Checkpoint saved at epoch 22

Epoch 23 | Loss: 1.5821 | Top-1 Acc: 63.52 | Top-5 Acc: 84.11: 100%|██████████| 10010/10010 [55:07<00:00, 3.03it/s]
Test Loss: 1.2972, Top-1 Accuracy: 67.91, Top-5 Accuracy: 88.70
Epoch 23 | Train Top-1 Acc: 63.52 | Test Top-1 Acc: 67.91
Checkpoint saved at epoch 23

Epoch 24 | Loss: 1.5596 | Top-1 Acc: 64.05 | Top-5 Acc: 84.42: 100%|██████████| 10010/10010 [55:30<00:00, 3.01it/s]
Test Loss: 1.2808, Top-1 Accuracy: 68.16, Top-5 Accuracy: 88.86
Epoch 24 | Train Top-1 Acc: 64.05 | Test Top-1 Acc: 68.16
Checkpoint saved at epoch 24

Epoch 25 | Loss: 1.5417 | Top-1 Acc: 64.41 | Top-5 Acc: 84.65: 100%|██████████| 10010/10010 [55:32<00:00, 3.00it/s]
Test Loss: 1.2718, Top-1 Accuracy: 68.48, Top-5 Accuracy: 88.98
Epoch 25 | Train Top-1 Acc: 64.41 | Test Top-1 Acc: 68.48
Checkpoint saved at epoch 25

Epoch 26 | Loss: 1.5286 | Top-1 Acc: 64.65 | Top-5 Acc: 84.83: 100%|██████████| 10010/10010 [55:49<00:00, 2.99it/s]
Test Loss: 1.2579, Top-1 Accuracy: 68.68, Top-5 Accuracy: 89.20
Epoch 26 | Train Top-1 Acc: 64.65 | Test Top-1 Acc: 68.68
Checkpoint saved at epoch 26

Epoch 27 | Loss: 1.5158 | Top-1 Acc: 64.98 | Top-5 Acc: 85.02: 100%|██████████| 10010/10010 [55:53<00:00, 2.98it/s]
Test Loss: 1.2543, Top-1 Accuracy: 68.81, Top-5 Accuracy: 89.07
Epoch 27 | Train Top-1 Acc: 64.98 | Test Top-1 Acc: 68.81
Checkpoint saved at epoch 27

Epoch 28 | Loss: 1.5033 | Top-1 Acc: 65.14 | Top-5 Acc: 85.18: 100%|██████████| 10010/10010 [55:37<00:00, 3.00it/s]
Test Loss: 1.2446, Top-1 Accuracy: 69.06, Top-5 Accuracy: 89.36
Epoch 28 | Train Top-1 Acc: 65.14 | Test Top-1 Acc: 69.06
Checkpoint saved at epoch 28

Epoch 29 | Loss: 1.4957 | Top-1 Acc: 65.39 | Top-5 Acc: 85.28: 100%|██████████| 10010/10010 [55:38<00:00, 3.00it/s]
Test Loss: 1.2412, Top-1 Accuracy: 69.10, Top-5 Accuracy: 89.23
Epoch 29 | Train Top-1 Acc: 65.39 | Test Top-1 Acc: 69.10
Checkpoint saved at epoch 29

Epoch 30 | Loss: 1.4852 | Top-1 Acc: 65.58 | Top-5 Acc: 85.40: 100%|██████████| 10010/10010 [55:26<00:00, 3.01it/s]
Test Loss: 1.2312, Top-1 Accuracy: 69.29, Top-5 Accuracy: 89.52
Epoch 30 | Train Top-1 Acc: 65.58 | Test Top-1 Acc: 69.29
Checkpoint saved at epoch 30

Checkpoint loaded, resuming from epoch 31

Epoch 31 | Loss: 1.4600 | Top-1 Acc: 66.12 | Top-5 Acc: 85.75: 100%|██████████| 10010/10010 [55:14<00:00, 3.02it/s]
Test Loss: 1.2200, Top-1 Accuracy: 69.71, Top-5 Accuracy: 89.54
Epoch 31 | Train Top-1 Acc: 66.12 | Test Top-1 Acc: 69.71
Checkpoint saved at epoch 31

Epoch 32 | Loss: 1.4530 | Top-1 Acc: 66.29 | Top-5 Acc: 85.84: 100%|██████████| 10010/10010 [55:25<00:00, 3.01it/s]
Test Loss: 1.2182, Top-1 Accuracy: 69.55, Top-5 Accuracy: 89.74
Epoch 32 | Train Top-1 Acc: 66.29 | Test Top-1 Acc: 69.55
Checkpoint saved at epoch 32

Epoch 33 | Loss: 1.4423 | Top-1 Acc: 66.52 | Top-5 Acc: 85.99: 100%|██████████| 10010/10010 [55:23<00:00, 3.01it/s]
Test Loss: 1.2073, Top-1 Accuracy: 69.72, Top-5 Accuracy: 89.78
Epoch 33 | Train Top-1 Acc: 66.52 | Test Top-1 Acc: 69.72
Checkpoint saved at epoch 33

Epoch 34 | Loss: 1.4382 | Top-1 Acc: 66.59 | Top-5 Acc: 86.04: 100%|██████████| 10010/10010 [55:26<00:00, 3.01it/s]
Test Loss: 1.2097, Top-1 Accuracy: 69.94, Top-5 Accuracy: 89.65
Epoch 34 | Train Top-1 Acc: 66.59 | Test Top-1 Acc: 69.94
Checkpoint saved at epoch 34

Epoch 35 | Loss: 1.4308 | Top-1 Acc: 66.73 | Top-5 Acc: 86.16: 100%|██████████| 10010/10010 [55:40<00:00, 3.00it/s]
Test Loss: 1.2043, Top-1 Accuracy: 69.98, Top-5 Accuracy: 89.78
Epoch 35 | Train Top-1 Acc: 66.73 | Test Top-1 Acc: 69.98
Checkpoint saved at epoch 35

Epoch 36 | Loss: 1.4247 | Top-1 Acc: 66.92 | Top-5 Acc: 86.21: 100%|██████████| 10010/10010 [55:45<00:00, 2.99it/s]
Test Loss: 1.2003, Top-1 Accuracy: 69.87, Top-5 Accuracy: 89.88
Epoch 36 | Train Top-1 Acc: 66.92 | Test Top-1 Acc: 69.87
Checkpoint saved at epoch 36

Epoch 37 | Loss: 1.4188 | Top-1 Acc: 67.02 | Top-5 Acc: 86.28: 100%|██████████| 10010/10010 [55:21<00:00, 3.01it/s]
Test Loss: 1.1959, Top-1 Accuracy: 69.92, Top-5 Accuracy: 89.90
Epoch 37 | Train Top-1 Acc: 67.02 | Test Top-1 Acc: 69.92
Checkpoint saved at epoch 37

Epoch 38 | Loss: 1.4119 | Top-1 Acc: 67.16 | Top-5 Acc: 86.41: 100%|██████████| 10010/10010 [55:19<00:00, 3.02it/s]
Test Loss: 1.1927, Top-1 Accuracy: 70.16, Top-5 Accuracy: 89.86
Epoch 38 | Train Top-1 Acc: 67.16 | Test Top-1 Acc: 70.16
Checkpoint saved at epoch 38

Epoch 39 | Loss: 1.4071 | Top-1 Acc: 67.30 | Top-5 Acc: 86.46: 100%|██████████| 10010/10010 [55:07<00:00, 3.03it/s]
Test Loss: 1.1876, Top-1 Accuracy: 70.27, Top-5 Accuracy: 89.94
Epoch 39 | Train Top-1 Acc: 67.30 | Test Top-1 Acc: 70.27
```