CLIP

Contrastive Language-Image Pretraining (CLIP) model pre-trained on LAION-2B at resolution 224x224, with roughly 1.37B parameters. CLIP was introduced in the paper Learning Transferable Visual Models From Natural Language Supervision, and this training setup was reproduced in the follow-up paper Reproducible scaling laws for contrastive language-image learning. The weights were converted from the laion/CLIP-ViT-g-14-laion2B-s34B-b88K checkpoint from the OpenCLIP LAION-2B collection.

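Below is a minimal zero-shot classification sketch, assuming the converted checkpoint loads with the standard Transformers `CLIPModel` and `CLIPProcessor` classes; the example image URL and candidate captions are purely illustrative.

```python
import torch
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumption: the repo id below works with the standard Transformers CLIP classes.
model_id = "cs-giung/clip-vit-giant-patch14-laion2b"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

# Illustrative image and candidate captions for zero-shot classification.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity logits, converted to probabilities over the captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(probs)
```

The same weights can also be used through OpenCLIP if you prefer that stack; the sketch above only covers the Transformers path.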