liuyanyi committed
Commit 9fa755c · verified · 1 Parent(s): 8f08a66

Create README.md

  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ ---
+ # BGE-M3 in HuggingFace Transformers
+
+ > **This is not an official implementation of BGE-M3. The official implementation can be found in the [Flag Embedding](https://github.com/FlagOpen/FlagEmbedding) project.**
+
+ ## Introduction
+
+ For a full introduction, please see the GitHub repository:
+
+ https://github.com/liuyanyi/transformers-bge-m3
+
+ ## Use BGE-M3 in HuggingFace Transformers
+
+ ```python
+ from transformers import AutoModel, AutoTokenizer
+
+ # Placeholder: set this to the local directory or Hub ID of this model repository
+ model_path = "path/to/this/model"
+
+ # trust_remote_code=True is required to load the model
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+ model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
+
+ input_str = "Hello, world!"
+ # The tokenizer returns a dict-like batch (input_ids, attention_mask), not just IDs
+ inputs = tokenizer(input_str, return_tensors="pt", padding=True, truncation=True)
+
+ output = model(**inputs, return_dict=True)
+
+ dense_output = output.dense_output  # Must be normalized to align with the Flag Embedding project
+ colbert_output = output.colbert_output  # Must be normalized to align with the Flag Embedding project
+ sparse_output = output.sparse_output
+ ```
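
The comments in the snippet above say that the dense and ColBERT outputs need a normalization step to align with the Flag Embedding project. The README does not say which normalization is meant; the sketch below assumes L2 normalization along the embedding dimension, which is the usual choice when embeddings are compared by dot product. It is written in plain Python on a single vector so the arithmetic is visible. The helper names `l2_normalize` and `dot` and the sample vectors are hypothetical and are not part of this repository.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs have unit norm."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical stand-ins for two embedding vectors (real ones are much longer)
query_vec = l2_normalize([3.0, 4.0])
passage_vec = l2_normalize([4.0, 3.0])

# After normalization, the dot product lies in [-1, 1]
similarity = dot(query_vec, passage_vec)
print(similarity)
```

For the PyTorch tensors returned by the model, `torch.nn.functional.normalize(x, p=2, dim=-1)` applies the same operation to every vector along the last dimension, under the same assumption about which normalization is intended.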
+
+ ## References
+
+ - [Official BGE-M3 Weights](https://huggingface.co/BAAI/bge-m3)
+ - [Flag Embedding](https://github.com/FlagOpen/FlagEmbedding)
+ - [HuggingFace Transformers](https://github.com/huggingface/transformers)