zpn committed on
Commit 0c579fa · 1 Parent(s): 78aab1d

feat: sentence transformers

Files changed (1)
  1. README.md +21 -1
README.md CHANGED
@@ -2902,7 +2902,9 @@ base_model:
 
  # ModernBERT Embed
 
- ModernBERT Embed is an embedding model trained from [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base), bringing the new advances of ModernBERT to embeddings!
+ ModernBERT Embed is an embedding model trained from [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base), bringing the new advances of ModernBERT to embeddings!
+
+ Trained on the [Nomic Embed](https://arxiv.org/abs/2402.01613) weakly-supervised and supervised datasets, `modernbert-embed` also supports Matryoshka Representation Learning, so embeddings can be truncated to 256 dimensions, reducing memory by 3x with minimal performance loss.
 
  ## Performance
 
@@ -2958,6 +2960,24 @@ embeddings = F.normalize(embeddings, p=2, dim=1)
  print(embeddings)
  ```
 
+ ### Sentence Transformers
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer(
+     "nomic-ai/modernbert-embed",
+ )
+
+ # Verify that everything works as expected
+ embeddings = model.encode(['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?'])
+ print(embeddings.shape)
+
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ ```
+
+
  ## Training
 
  Click the Nomic Atlas map below to visualize a 5M sample of our contrastive pretraining data!
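
Note on the Matryoshka claim in the added README text: the 3x memory figure follows from keeping 256 of the embedding's dimensions. The sketch below illustrates that arithmetic in plain Python. It is not the model's own code. It assumes a 768-dimensional full embedding (which is what the 3x figure implies), uses a random stand-in vector instead of real model output, and the helper `truncate_embedding` is a hypothetical name introduced here. The model card may prescribe extra steps before truncation, so treat this as an illustration only.

```python
import math
import random

FULL_DIM = 768       # assumed full embedding size (implied by the 3x figure)
TRUNCATED_DIM = 256  # Matryoshka dimension named in the README text


def truncate_embedding(vector, dim):
    """Hypothetical helper: keep the first `dim` values, then L2-normalize."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # avoid dividing by zero for an all-zero vector
    return [x / norm for x in head]


# Random stand-in vector; a real one would come from model.encode(...)
random.seed(0)
full = [random.gauss(0.0, 1.0) for _ in range(FULL_DIM)]

small = truncate_embedding(full, TRUNCATED_DIM)
memory_ratio = len(full) / len(small)  # storage shrinks with the dimension count

print(len(small), memory_ratio)
```

With Sentence Transformers, a similar effect can usually be obtained by passing `truncate_dim=256` when constructing `SentenceTransformer`; whether that matches the intended usage for this model should be checked against the model card.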