Update README.md
README.md CHANGED
@@ -179,6 +179,14 @@ CLIP-like models have established themselves as the backbone for general-purpose
 
 An updated version of our [technical report](https://arxiv.org/abs/2405.20204) with details on `jina-clip-v2` is coming soon. Stay tuned!
 
+## Faster Inference: FA2, xFormers, and bf16
+
+In a CUDA-enabled torch environment, the model comes in `torch.bfloat16`
+precision by default. It is highly recommended to install
+[FlashAttention](https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features)
+and [xFormers](https://github.com/facebookresearch/xformers?tab=readme-ov-file#installing-xformers)
+to make use of their efficient attention implementations.
+
 
 ## Usage
 
@@ -389,13 +397,6 @@ _, _, text_embeddings, image_embeddings = output
 
 </details>
 
-### On CUDA devices
-
-On a CUDA enabled torch environment, the model comes in `torch.bfloat16`
-precision by default. When running on CUDA, it is recommended to install
-[FlashAttention](https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features)
-and [xFormers](https://github.com/facebookresearch/xformers?tab=readme-ov-file#installing-xformers)
-to make use of their efficient attention mechanism implementations.
 
 
 ## License
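For context on the section this commit adds, here is a minimal sketch of the recommended setup. The snippet is not part of the commit itself: the pip commands follow the linked projects' installation docs, and loading `jinaai/jina-clip-v2` via `transformers` with `trust_remote_code` is an assumption based on the usage examples elsewhere in this README.

```python
# Optional attention backends recommended by the new README section.
# Install commands follow the linked repos' docs (assumption, not part
# of this commit):
#   pip install flash-attn --no-build-isolation
#   pip install xformers

import torch
from transformers import AutoModel

# trust_remote_code=True loads Jina CLIP's custom modeling code
# (assumed loading pattern, mirroring the README's usage section).
model = AutoModel.from_pretrained("jinaai/jina-clip-v2", trust_remote_code=True)

if torch.cuda.is_available():
    model = model.to("cuda")
    # Per the README, the model comes in torch.bfloat16 precision by
    # default on CUDA; FlashAttention / xFormers kernels are picked up
    # automatically when those packages are installed.
    print(next(model.parameters()).dtype)  # expected: torch.bfloat16
```

No extra configuration is shown here because the README frames both libraries as drop-in installs rather than options that must be enabled in code.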