kimihailv committed on
Commit
fe3a8a1
1 Parent(s): 016f2b5

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -12,13 +12,13 @@ For Semantic Search Applications<br/>
 UForm is a Multi-Modal Modal Inference package, designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents, into a shared vector space!
 It extends the `transfromers` package to support Mid-fusion Models.
 
-This is model card of __English only model__ with:
+This is model card of the __English only model__ with:
 
 * 4 layers BERT (2 layers for unimodal encoding and rest layers for multimodal encoding)
 * ViT-B/16 (image resolution is 224x224)
 
 
-If you need multilingual model, check [this](https://huggingface.co/unum-cloud/uform-vl-multilingual).
+If you need Multilingual model, check [this](https://huggingface.co/unum-cloud/uform-vl-multilingual).
 
 ## Installation
 