Add Sentence Transformers snippet to README

#2
by tomaarsen HF staff - opened
Files changed (1)
  1. README.md +18 -3
README.md CHANGED
@@ -3223,7 +3223,7 @@ Jina Embeddings V2 [technical report](https://arxiv.org/abs/2310.19923)
 
  ### Why mean pooling?
 
- `mean poooling` takes all token embeddings from model output and averaging them at sentence/paragraph level.
+ `mean pooling` takes all token embeddings from model output and averaging them at sentence/paragraph level.
  It has been proved to be the most effective way to produce high-quality sentence embeddings.
  We offer an `encode` function to deal with this.
 
@@ -3256,7 +3256,7 @@ embeddings = F.normalize(embeddings, p=2, dim=1)
  </p>
  </details>
 
- You can use Jina Embedding models directly from transformers package:
+ You can use Jina Embedding models directly from the `transformers` package:
  ```python
  !pip install transformers
  from transformers import AutoModel
@@ -3277,7 +3277,22 @@ embeddings = model.encode(
  )
  ```
 
- ## Alternatives to Using Transformers Package
+ Or you can use the model with the `sentence-transformers` package:
+ ```python
+ from sentence_transformers import SentenceTransformer, util
+
+ model = SentenceTransformer("jinaai/jina-embeddings-v2-base-es", trust_remote_code=True)
+ embeddings = model.encode(['How is the weather today?', '¿Qué tiempo hace hoy?'])
+ print(util.cos_sim(embeddings[0], embeddings[1]))
+ ```
+
+ And if you only want to handle shorter sequence, such as 2k, then you can set the `model.max_seq_length`
+
+ ```python
+ model.max_seq_length = 2048
+ ```
+
+ ## Alternatives to Transformers and Sentence Transformers
 
  1. _Managed SaaS_: Get started with a free key on Jina AI's [Embedding API](https://jina.ai/embeddings/).
  2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploy them on [AWS Sagemaker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy).
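
For anyone reading the "Why mean pooling?" paragraph touched by this diff, here is a minimal sketch of what mean pooling and cosine similarity compute. It runs on toy lists in plain Python. The helpers `mean_pool` and `cos_sim` are hypothetical names written for this note. They are not the model's `encode` implementation or the `sentence_transformers.util.cos_sim` function, and the attention-mask handling is an assumption about how padded tokens are usually excluded.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sequence, skipping masked (padding) positions."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [total / count for total in summed]


def cos_sim(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "model output": three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
print(cos_sim(sentence_vector, [2.0, 3.0]))
```

The padded third vector is ignored, so the result is the average of the first two token vectors only. Real models do the same thing on tensors, typically followed by L2 normalization.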