Xenova committed
Commit a6fe346 · verified · 1 Parent(s): 1299fb8

Add Transformers.js tags + sample code

Files changed (1)
  1. README.md +32 -0
README.md CHANGED
@@ -8,6 +8,7 @@ pipeline_tag: sentence-similarity
library_name: transformers
tags:
- sentence-transformers
+ - transformers.js
---

# gte-reranker-modernbert-base

@@ -96,6 +97,37 @@ print(scores)
# NOTE: Sentence Transformers calls Softmax over the outputs by default, hence the scores are in [0, 1] range.
```

+ Use with `transformers.js`
+ ```js
+ import {
+   AutoTokenizer,
+   AutoModelForSequenceClassification,
+ } from "@huggingface/transformers";
+
+ const model_id = "Alibaba-NLP/gte-reranker-modernbert-base";
+ const model = await AutoModelForSequenceClassification.from_pretrained(
+   model_id,
+   { dtype: "fp32" }, // Supported options: "fp32", "fp16", "q8", "q4", "q4f16"
+ );
+ const tokenizer = await AutoTokenizer.from_pretrained(model_id);
+
+ const pairs = [
+   ["what is the capital of China?", "Beijing"],
+   ["how to implement quick sort in python?", "Introduction of quick sort"],
+   ["how to implement quick sort in python?", "The weather is nice today"],
+ ];
+ const inputs = tokenizer(
+   pairs.map((x) => x[0]),
+   {
+     text_pair: pairs.map((x) => x[1]),
+     padding: true,
+     truncation: true,
+   },
+ );
+ const { logits } = await model(inputs);
+ console.log(logits.tolist()); // [[2.138258218765259], [2.4609625339508057], [-1.6775450706481934]]
+ ```
+

## Training Details

The `gte-modernbert` series of models follows the training scheme of the previous [GTE models](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469), with the only difference being that the pre-training language model base has been replaced from [GTE-MLM](https://huggingface.co/Alibaba-NLP/gte-en-mlm-base) to [ModernBert](https://huggingface.co/answerdotai/ModernBERT-base). For more training details, please refer to our paper: [mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval](https://aclanthology.org/2024.emnlp-industry.103/)
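
The added sample logs raw logits, whereas the Sentence Transformers snippet above reports normalized scores in [0, 1]. A minimal follow-on sketch (not part of the commit; mapping each single logit through a sigmoid is an assumption for a single-logit reranker), continuing from the sample's `logits`:

```js
// Sketch (assumption): squash each raw logit into [0, 1] with a sigmoid,
// mirroring the normalized scores of the Sentence Transformers example.
// `logits` is the tensor produced by the Transformers.js sample above.
const scores = logits.tolist().map(([logit]) => 1 / (1 + Math.exp(-logit)));
console.log(scores); // ≈ [0.894, 0.921, 0.157] for the logits shown above
```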