Update README.md

Files changed (1) hide show

README.md CHANGED Viewed

@@ -2,4 +2,52 @@
 license: mit
 ---
 Model training details and data will be uploaded soon!

 license: mit
 ---
+## Usage
+Code example
+```python
+import torch.nn.functional as F
+from torch import Tensor
+from transformers import AutoTokenizer, AutoModel
+def average_pool(last_hidden_states: Tensor,
+                 attention_mask: Tensor) -> Tensor:
+    last_hidden = last_hidden_states.masked_fill(~attention_mask[..., None].bool(), 0.0)
+    return last_hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]
+input_texts = [
+    "what is the capital of Japan?",
+    "Kyoto",
+    "Tokyo",
+    "Beijing"
+]
+tokenizer = AutoTokenizer.from_pretrained("iamgroot42/rover_nexus")
+model = AutoModel.from_pretrained("iamgroot42/rover_nexus")
+# Tokenize the input texts
+batch_dict = tokenizer(input_texts, max_length=512, padding=True, truncation=True, return_tensors='pt')
+outputs = model(**batch_dict)
+embeddings = average_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
+# (Optionally) normalize embeddings
+embeddings = F.normalize(embeddings, p=2, dim=1)
+scores = (embeddings[:1] @ embeddings[1:].T) * 100
+print(scores.tolist())
+```
+Use with sentence-transformers:
+```python
+from sentence_transformers import SentenceTransformer
+from sentence_transformers.util import cos_sim
+sentences = ['That is a happy person', 'That is a sad person']
+model = SentenceTransformer('iamgroot42/rover_nexus')
+embeddings = model.encode(sentences)
+print(cos_sim(embeddings[0], embeddings[1]))
+```
 Model training details and data will be uploaded soon!