/AllReduce: ResNet-50 models trained with AllReduce SGD
- Training Details:
  - Seeds: 810975, 810976, 810977
  - Epochs: 90
  - Max LR: 1.0
  - LR scheduler: Cosine annealing with a linear warm-up in the first 5 epochs
  - Batch size: 1024
  - Momentum: 0.875
- Results:
  - Top-1 Accuracy: 77.5327 ± 0.1685
  - Top-5 Accuracy: 93.6127 ± 0.0998
  - Val Loss: 1.9389 ± 0.0094
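
The learning-rate schedule listed in the training details (linear warm-up over the first 5 epochs to a max LR of 1.0, then cosine annealing until epoch 90) can be sketched as below. This is a minimal illustration, not the training code behind these checkpoints: the function name `lr_at_epoch` is hypothetical, and it assumes the warm-up starts from 0 and the cosine phase decays to 0, neither of which is stated above.

```python
import math


def lr_at_epoch(epoch: float, max_lr: float = 1.0,
                warmup_epochs: int = 5, total_epochs: int = 90) -> float:
    """Learning rate at a (possibly fractional) epoch.

    Assumptions: warm-up starts at 0 and the cosine phase ends at 0.
    """
    if epoch < warmup_epochs:
        # Linear warm-up: 0 -> max_lr over the first `warmup_epochs` epochs.
        return max_lr * epoch / warmup_epochs
    # Cosine annealing: max_lr -> 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the schedule peaks at exactly the max LR at the end of epoch 5 and reaches half of it midway through the cosine phase (epoch 47.5).
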
/DSGDm-8-ring: ResNet-50 models trained with decentralized SGD with momentum
- Training Details:
  - Seeds: 810975, 810976, 810977
  - Epochs: 90
  - Max LR: 1.0
  - LR scheduler: Cosine annealing with a linear warm-up in the first 5 epochs
  - Batch size: 1024
  - Momentum: 0.875
  - Number of workers: 8
  - Communication topology: one-peer ring (time-varying topology)
- Results:
  - Top-1 Accuracy: 77.4233 ± 0.1227
  - Top-5 Accuracy: 93.5407 ± 0.0546
  - Val Loss: 1.9332 ± 0.0031
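
To give a feel for what a one-peer, time-varying ring topology means, the sketch below builds a mixing matrix in which each of the 8 workers averages its parameters with a single ring neighbour per step. The exact peer-selection rule used for these checkpoints is not documented here, so the alternation between right and left neighbours is an assumption, and `one_peer_ring_matrix` and `gossip` are hypothetical helper names.

```python
import numpy as np


def one_peer_ring_matrix(n: int, step: int) -> np.ndarray:
    """Doubly stochastic mixing matrix for one communication step.

    Assumption: worker i averages with worker i+1 on even steps and with
    worker i-1 on odd steps (indices taken modulo n).
    """
    shift = 1 if step % 2 == 0 else -1
    peer = np.roll(np.eye(n), shift, axis=1)  # cyclic permutation matrix
    return 0.5 * (np.eye(n) + peer)


def gossip(params: np.ndarray, n_steps: int) -> np.ndarray:
    """Apply `n_steps` rounds of one-peer averaging (rows = workers)."""
    x = params.copy()
    n = x.shape[0]
    for t in range(n_steps):
        x = one_peer_ring_matrix(n, t) @ x
    return x
```

Because every matrix is doubly stochastic, the average over workers is preserved at each step, while individual workers only approach that average gradually over many steps.
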
/DSGDm-8-complete: ResNet-50 models trained with decentralized SGD with momentum
- Training Details:
  - Seeds: 810975, 810976, 810977
  - Epochs: 90
  - Max LR: 1.0
  - LR scheduler: Cosine annealing with a linear warm-up in the first 5 epochs
  - Batch size: 1024
  - Momentum: 0.875
  - Number of workers: 8
  - Communication topology: complete
- Results:
  - Top-1 Accuracy: 77.4440 ± 0.0694
  - Top-5 Accuracy: 93.6567 ± 0.0197
  - Val Loss: 1.9361 ± 0.0040
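
With a complete topology every worker averages with all others, so a single communication round brings all workers to the same parameters. The sketch below shows one decentralized momentum step under that topology. It assumes a common DSGDm variant (local heavy-ball update followed by parameter averaging); the variant actually used for these checkpoints is not specified above, and `dsgdm_step` and `complete_matrix` are hypothetical names.

```python
import numpy as np


def complete_matrix(n: int) -> np.ndarray:
    """Mixing matrix of the complete graph: uniform averaging over n workers."""
    return np.full((n, n), 1.0 / n)


def dsgdm_step(x, m, grads, W, lr, beta=0.875):
    """One step for all workers at once (rows = workers).

    Assumed variant: each worker updates its own momentum buffer, takes a
    local step, then averages parameters with its peers through W.
    """
    m = beta * m + grads   # local momentum buffer, beta from the list above
    x = W @ (x - lr * m)   # local step followed by peer averaging
    return x, m
```

Passing a ring matrix instead of `complete_matrix(8)` leaves the update rule unchanged and only alters which workers are averaged together.
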