Files changed (1)
  1. README.md +22 -7
README.md CHANGED
@@ -2621,7 +2621,7 @@ This is our base sentence embedding model. It was trained using [AnglE](https://
2621
 
2622
  ## Quickstart
2623
 
2624
- Here, we provide several ways to produce sentence embeddings. Please note that you have to provide the prompt `Represent this sentence for searching relevant passages:` for query if you want to use it for retrieval. Besides that you don't need any prompt.
2625
 
2626
  ### sentence-transformers
2627
 
@@ -2632,6 +2632,7 @@ python -m pip install -U sentence-transformers
2632
  ```python
2633
  from sentence_transformers import SentenceTransformer
2634
  from sentence_transformers.util import cos_sim
 
2635
 
2636
  # 1. load model
2637
  model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
@@ -2652,6 +2653,16 @@ embeddings = model.encode(docs)
2652
 
2653
  similarities = cos_sim(embeddings[0], embeddings[1:])
2654
  print('similarities:', similarities)
2655
  ```
2656
  ### Transformers
2657
 
@@ -2669,7 +2680,7 @@ def transform_query(query: str) -> str:
2669
  """
2670
  return f'Represent this sentence for searching relevant passages: {query}'
2671
 
2672
- # The model works really well with cls pooling (default) but also with mean poolin.
2673
  def pooling(outputs: torch.Tensor, inputs: Dict, strategy: str = 'cls') -> np.ndarray:
2674
  if strategy == 'cls':
2675
  outputs = outputs[:, 0]
@@ -2744,7 +2755,6 @@ You can use the model via our API as follows:
2744
 
2745
  ```python
2746
  from mixedbread_ai.client import MixedbreadAI
2747
- from sklearn.metrics.pairwise import cosine_similarity
2748
  import os
2749
 
2750
  mxbai = MixedbreadAI(api_key="{MIXEDBREAD_API_KEY}")
@@ -2756,16 +2766,21 @@ english_sentences = [
2756
 
2757
  res = mxbai.embeddings(
2758
  input=english_sentences,
2759
- model="mixedbread-ai/mxbai-embed-large-v1"
2760
  )
2761
- embeddings = [entry.embedding for entry in res.data]
2762
 
2763
- similarities = cosine_similarity([embeddings[0]], [embeddings[1]])
2764
- print(similarities)
2765
  ```
2766
 
2767
  The API comes with native INT8 and binary quantization support! Check out the [docs](https://mixedbread.ai/docs) for more information.
2768
2769
  ## Evaluation
2770
  As of March 2024, our model achieves SOTA performance for BERT-large sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, such as [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.
2771
 
 
2621
 
2622
  ## Quickstart
2623
 
2624
+ Here, we provide several ways to produce sentence embeddings. Please note that you have to provide the prompt `Represent this sentence for searching relevant passages:` for the query if you want to use the model for retrieval. For all other use cases, no prompt is needed. Our model also supports Matryoshka Representation Learning and (binary) quantization when used via the API or Sentence Transformers.
2625
 
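To make the retrieval prompt concrete, here is a minimal, illustrative sketch (the query and passages are made up; only the query gets the prefix, documents are encoded as-is):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# The retrieval prompt is simply prepended to the query string.
query = "Represent this sentence for searching relevant passages: A man is eating a piece of bread"
passages = [
    "A man is eating food.",
    "A man is riding a horse.",
]

query_embedding = model.encode(query)
passage_embeddings = model.encode(passages)

# Rank passages by cosine similarity to the prompted query.
print(cos_sim(query_embedding, passage_embeddings))
```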
2626
  ### sentence-transformers
2627
 
 
2632
  ```python
2633
  from sentence_transformers import SentenceTransformer
2634
  from sentence_transformers.util import cos_sim
2635
+ from sentence_transformers.quantization import quantize_embeddings
2636
 
2637
  # 1. load model
2638
  model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
 
2653
 
2654
  similarities = cos_sim(embeddings[0], embeddings[1:])
2655
  print('similarities:', similarities)
2656
+
2657
+ # 2a. Encode with selection of MRL dimensions
2658
+ mrl_embeddings = model.encode(docs, normalize_embeddings=True)[..., :512]
2659
+
2660
+ mrl_similarities = cos_sim(mrl_embeddings[0], mrl_embeddings[1:])
2661
+ print('mrl_similarities:', mrl_similarities)
2662
+
2663
+ # 3. Apply binary quantization
2664
+ binary_embeddings = quantize_embeddings(embeddings, precision="binary")
2665
+ binary_mrl_embeddings = quantize_embeddings(mrl_embeddings, precision="binary")
2666
  ```
2667
  ### Transformers
2668
 
 
2680
  """
2681
  return f'Represent this sentence for searching relevant passages: {query}'
2682
 
2683
+ # The model works really well with cls pooling (default) but also with mean pooling.
2684
  def pooling(outputs: torch.Tensor, inputs: Dict, strategy: str = 'cls') -> np.ndarray:
2685
  if strategy == 'cls':
2686
  outputs = outputs[:, 0]
 
2755
 
2756
  ```python
2757
  from mixedbread_ai.client import MixedbreadAI
 
2758
  import os
2759
 
2760
  mxbai = MixedbreadAI(api_key="{MIXEDBREAD_API_KEY}")
 
2766
 
2767
  res = mxbai.embeddings(
2768
  input=english_sentences,
2769
+ model="mixedbread-ai/mxbai-embed-large-v1",
2770
+ normalized=True,
2771
+ encoding_format=['ubinary', 'float'],
2772
+ dimensions=512
2773
  )
 
2774
 
2775
+ print(res.dimensions, res.data[0].embedding.ubinary, res.data[0].embedding.float_)
 
2776
  ```
2777
 
2778
  The API comes with native INT8 and binary quantization support! Check out the [docs](https://mixedbread.ai/docs) for more information.
2779
 
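For completeness, a hedged sketch of requesting INT8 output from the API, reusing the `mxbai` client and `english_sentences` defined above. It assumes `"int8"` is an accepted `encoding_format` value analogous to `'ubinary'`; treat the exact parameter values as an assumption and confirm them against the docs:

```python
# Assumption: "int8" is a valid encoding_format value, analogous to "ubinary" above.
res_int8 = mxbai.embeddings(
    input=english_sentences,
    model="mixedbread-ai/mxbai-embed-large-v1",
    normalized=True,
    encoding_format="int8",
)
print(len(res_int8.data))
```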
2780
+ ### Why binary MRL?
2781
+
2782
+ The combination of binary quantization and Matryoshka Representation Learning allows you to reduce the memory usage of your embeddings significantly. This leads to much lower costs when using a vector database. You can read more about the technology and its advantages in our [blog post](https://www.mixedbread.ai/blog/binary-mrl).
2783
+
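As a back-of-the-envelope illustration of the savings (a sketch using the model's 1024-dimensional float32 output as the baseline):

```python
# Storage per vector: 1024-dim float32 baseline vs. 512-dim binary (MRL + binary quantization).
full_float32_bytes = 1024 * 4   # 4 bytes per float32 dimension -> 4096 bytes
mrl_binary_bytes = 512 // 8     # 1 bit per dimension           -> 64 bytes

print(f"float32, 1024 dims: {full_float32_bytes} bytes")
print(f"binary,   512 dims: {mrl_binary_bytes} bytes")
print(f"reduction:          {full_float32_bytes // mrl_binary_bytes}x")  # 64x
```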
2784
  ## Evaluation
2785
  As of March 2024, our model achieves SOTA performance for BERT-large sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, such as [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.
2786