What is the difference between this and CLIP-ViT-g-14-laion2B-s12B-b42K and CLIP-ViT-bigG-14-laion2B-39B-b160k?
by AlexWortega
Can you add a README explaining the difference?
@AlexWortega it's a different version of https://huggingface.co/laion/CLIP-ViT-g-14-laion2B-s12B-b42K, not bigG (the larger model); it was trained on the same data, but for more samples seen and with a larger global batch size.
s34B means 34B samples seen during training, and b88K means a global batch size of ~88,000, versus the 12B samples seen and ~42,000 batch size of the earlier 'g' checkpoint.
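For anyone who wants to compare the two 'g' checkpoints side by side, here is a minimal sketch using open_clip; it assumes the laion2b_s12b_b42k and laion2b_s34b_b88k pretrained tags are registered in your installed open_clip version:

```python
import open_clip

# Earlier 'g' checkpoint: 12B samples seen, global batch size ~42K
model_s12b, _, preprocess_s12b = open_clip.create_model_and_transforms(
    'ViT-g-14', pretrained='laion2b_s12b_b42k'
)

# This checkpoint: 34B samples seen, global batch size ~88K
model_s34b, _, preprocess_s34b = open_clip.create_model_and_transforms(
    'ViT-g-14', pretrained='laion2b_s34b_b88k'
)

# Same ViT-g-14 architecture in both cases, so the tokenizer is shared
tokenizer = open_clip.get_tokenizer('ViT-g-14')
```

You can also point create_model_and_transforms straight at the Hub repo, e.g. 'hf-hub:laion/CLIP-ViT-g-14-laion2B-s34B-b88K', if your open_clip version supports hf-hub loading.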
From the https://github.com/mlfoundations/open_clip README, this model is the 78.5 entry. Yes, the README here should be updated, but there's a big stack of TODOs :)
- ViT-H/14 on LAION-2B with an accuracy of 78.0. The second best in1k zero-shot for released, open-source weights thus far.
- ViT-g/14 on LAION-2B with an accuracy of 76.6. This was trained on a reduced 12B samples-seen schedule, the same samples seen as the 400M models.
- ViT-g/14 on LAION-2B with an accuracy of 78.5. Full 34B samples seen schedule.
- ViT-G/14 on LAION-2B with an accuracy of 80.1. The best in1k zero-shot for released, open-source weights thus far.
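If you want to sanity-check zero-shot behaviour yourself, here is a minimal sketch of the standard open_clip zero-shot classification flow with this checkpoint; the image path and prompt set are just placeholders:

```python
import torch
from PIL import Image
import open_clip

# The 78.5 in1k zero-shot checkpoint discussed above
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-g-14', pretrained='laion2b_s34b_b88k'
)
model.eval()
tokenizer = open_clip.get_tokenizer('ViT-g-14')

image = preprocess(Image.open('example.jpg')).unsqueeze(0)  # placeholder image
text = tokenizer(['a photo of a cat', 'a photo of a dog'])  # placeholder prompts

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings, then softmax over scaled cosine similarities
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # per-prompt probabilities for the image
```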