jupyterjazz committed
Commit 47c6c01 · 1 Parent(s): 936ce79
Files changed (1)
  1. README.md +31 -0
README.md CHANGED
@@ -21688,6 +21688,37 @@ embeddings = model.encode(
  )
  ```

+ Furthermore, you can use ONNX for efficient inference with `jina-embeddings-v3`:
+ ```python
+ import onnxruntime
+ import numpy as np
+ from transformers import AutoTokenizer, PretrainedConfig
+
+ # Load tokenizer and model config
+ tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v3')
+ config = PretrainedConfig.from_pretrained('jinaai/jina-embeddings-v3')
+
+ # Tokenize input
+ input_text = tokenizer('sample text', return_tensors='np')
+
+ # ONNX session
+ model_path = 'jina-embeddings-v3/onnx/model.onnx'
+ session = onnxruntime.InferenceSession(model_path)
+
+ # Prepare inputs for ONNX model
+ task_type = 'text-matching'
+ task_id = np.array(config.lora_adaptations.index(task_type), dtype=np.int64)
+ inputs = {
+     'input_ids': input_text['input_ids'],
+     'attention_mask': input_text['attention_mask'],
+     'task_id': task_id
+ }
+
+ # Run model
+ outputs = session.run(None, inputs)
+ ```
+
+


  ## Performance
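
The snippet added in this commit stops at `session.run` and does not show how to turn `outputs` into one vector per input. The following is a minimal sketch of one possible follow-up step, not something the commit itself confirms: it assumes `outputs[0]` holds token-level embeddings of shape `(batch, seq_len, hidden)` and that masked mean pooling followed by L2 normalization is the intended reduction. The helper name `mean_pool` and the synthetic arrays are hypothetical and stand in for the real model output so the block runs without downloading the model.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors over non-padding positions, then L2-normalize.

    Hypothetical helper: assumes token_embeddings is (batch, seq_len, hidden)
    and attention_mask is (batch, seq_len) with 1 for real tokens, 0 for padding.
    """
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over the hidden axis
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for fully masked rows
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)


# Synthetic stand-ins for outputs[0] and input_text['attention_mask'] (assumed shapes)
rng = np.random.default_rng(0)
fake_tokens = rng.normal(size=(2, 4, 8)).astype(np.float32)
fake_mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]], dtype=np.int64)

embeddings = mean_pool(fake_tokens, fake_mask)
print(embeddings.shape)
```

With the real session, the call would look like `mean_pool(outputs[0], input_text['attention_mask'])`, provided the assumption about the output layout holds for the exported model.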