Add new SentenceTransformer model
- .gitattributes +2 -0
- 1_Pooling/config.json +10 -0
- README.md +310 -0
- config.json +26 -0
- config_sentence_transformers.json +10 -0
- model.safetensors +3 -0
- modules.json +14 -0
- sentence_bert_config.json +4 -0
- special_tokens_map.json +51 -0
- tokenizer.json +3 -0
- tokenizer_config.json +65 -0
- unigram.json +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
+unigram.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json
ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 384,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
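The flags above enable plain mean pooling (`pooling_mode_mean_tokens`): token embeddings are averaged over non-padding positions to produce the sentence vector. A minimal NumPy sketch of that computation (the toy shapes and values are illustrative, not taken from this model):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding positions (pooling_mode_mean_tokens)."""
    mask = attention_mask[..., None].astype(float)   # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # avoid division by zero
    return summed / counts

# Batch of 2 sequences, length 4, embedding dim 3; the second has one padding token.
emb = np.arange(24, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 1], [1, 1, 1, 0]])
pooled = mean_pool(emb, mask)
print(pooled.shape)  # (2, 3)
```

The padding token in the second sequence is excluded both from the sum and from the divisor, which is what distinguishes masked mean pooling from a naive `mean(axis=1)`.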
README.md
ADDED
@@ -0,0 +1,310 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:5
- loss:CosineSimilarityLoss
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) <!-- at revision 8d6b950845285729817bf8e1af1861502c2fed0c -->
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("forestav/job_matching_sentence_transformer")
# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 5 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
* Approximate statistics based on the first 5 samples:
  |         | sentence_0 | sentence_1 | label |
  |:--------|:-----------|:-----------|:------|
  | type    | string     | string     | float |
  | details | <ul><li>min: 128 tokens</li><li>mean: 128.0 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 128 tokens</li><li>mean: 128.0 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.4</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence_0 | sentence_1 | label |
  |:-----------|:-----------|:------|
  | <code>Filip Orestav <br>Transformatorvägen 6, Sollentuna , Sweden <br>+46 76 873 30 77 | [email protected] | LinkedIn <br> <br>Ambitious fourth -year Industrial Engineering and Management student at KTH, pursuing a Master's in Machine <br>Learning. Entrepreneurial spirit with a track record of founding a successful consulting and investment company, <br>optimizing operations at a fund company, leading people at a supermarket store and driving growth for Sweden's <br>largest youth platform. <br> <br>EDUCATION <br>KTH Royal Institute of Technology Stockholm, Sweden <br>M.Sc. Industrial Engineering and Management GPA: 4. 57/5 <br>Master in Machine Learning Expected graduation 2026 <br> <br>Rudbecksgymnasiet Stockholm , Sweden <br>Natural Sciences 21.09/22.5 <br> Graduated 2021 <br>KEY SKILLS <br>• TECHNICAL: Python, Java, JavaScript, SQL, Machine Learning, Deep Learning <br>• BUSINESS: Financial analysis, Business analysis, Consulting, Project management , Strategic planning <br>• SOFT SKILLS: Critical thinking, Problem solving, Tim...</code> | <code>Nu söker vi nya medarbetare!<br>Vänligen märk din ansökan med heltid, då vi även söker extrapersonal.<br>Söker du ett utvecklande arbete i ett expansivt företag, då kan detta vara någonting för dig! Mattvaruhuset AB är Sveriges största butikskedja för mattor och golv med enheter i Bromma, Huddinge och Danderyd. Med en internet-handel på kraftigt uppåtgående är vi en stadigt växande aktör som just nu söker nya medarbetare till vår enhet i Danderyd.<br>Som medarbetare hos oss på Mattvaruhuset är du med i teamet som ansvarar för att hantera beställningar, ta hand om våra kunder och hålla butiken i toppskick.<br>Det är viktigt att du är motiverad och engagerad och tycker om att ha mycket kundkontakt.<br>Dina huvudsakliga arbetsuppgifter är försäljning, hantera beställningar samt varuhantering.<br>För att du ska passa till arbetet så bör du ha ett gott ordningssinne, uppskatta ett högt tempo samt ha en god fysik då det förekommer tunga lyft.<br>Mattvaruhuset öppnade sitt första varuhus redan 1987 och är idag Sv...</code> | <code>0.0</code> |
  | <code>Filip Orestav <br>Transformatorvägen 6, Sollentuna , Sweden <br>+46 76 873 30 77 | [email protected] | LinkedIn <br> <br>Ambitious fourth -year Industrial Engineering and Management student at KTH, pursuing a Master's in Machine <br>Learning. Entrepreneurial spirit with a track record of founding a successful consulting and investment company, <br>optimizing operations at a fund company, leading people at a supermarket store and driving growth for Sweden's <br>largest youth platform. <br> <br>EDUCATION <br>KTH Royal Institute of Technology Stockholm, Sweden <br>M.Sc. Industrial Engineering and Management GPA: 4. 57/5 <br>Master in Machine Learning Expected graduation 2026 <br> <br>Rudbecksgymnasiet Stockholm , Sweden <br>Natural Sciences 21.09/22.5 <br> Graduated 2021 <br>KEY SKILLS <br>• TECHNICAL: Python, Java, JavaScript, SQL, Machine Learning, Deep Learning <br>• BUSINESS: Financial analysis, Business analysis, Consulting, Project management , Strategic planning <br>• SOFT SKILLS: Critical thinking, Problem solving, Tim...</code> | <code>Vill du jobba på ett av Sveriges mest attraktiva företag som erbjuder en inkluderande kultur och personlig utveckling? Är du intresserad av att arbeta i en teamorienterad organisation där du får stöd och vägledning från erfarna kollegor? Starta din karriär som junior konsult inom vår Technology risk avdelning på EY! <br> <br>Vi söker nu juniora konsulter till vårt kontor i Stockholm och Göteborg med start i augusti 2025 och riktar oss mot dig som är nyutexaminerad inom områdena, ekonomi, IT-säkerhet, systemvetenskap, ingenjör eller motsvarande från universitet eller högskola. <br> <br><br><br>Din roll som konsult hos oss <br><br><br>Är du redo att ta dig an utmaningen att hjälpa organisationer att navigera genom det ständigt föränderliga landskapet av teknologiska risker? Vi söker nu drivna, analytiska och detaljorienterade konsulter som kan ansluta sig till vårt dynamiska team inom Technology Risk. Hos oss kommer du att arbeta med en mångsidig klientportfölj som sträcker sig över både den privata och offentliga...</code> | <code>1.0</code> |
  | <code>Filip Orestav <br>Transformatorvägen 6, Sollentuna , Sweden <br>+46 76 873 30 77 | [email protected] | LinkedIn <br> <br>Ambitious fourth -year Industrial Engineering and Management student at KTH, pursuing a Master's in Machine <br>Learning. Entrepreneurial spirit with a track record of founding a successful consulting and investment company, <br>optimizing operations at a fund company, leading people at a supermarket store and driving growth for Sweden's <br>largest youth platform. <br> <br>EDUCATION <br>KTH Royal Institute of Technology Stockholm, Sweden <br>M.Sc. Industrial Engineering and Management GPA: 4. 57/5 <br>Master in Machine Learning Expected graduation 2026 <br> <br>Rudbecksgymnasiet Stockholm , Sweden <br>Natural Sciences 21.09/22.5 <br> Graduated 2021 <br>KEY SKILLS <br>• TECHNICAL: Python, Java, JavaScript, SQL, Machine Learning, Deep Learning <br>• BUSINESS: Financial analysis, Business analysis, Consulting, Project management , Strategic planning <br>• SOFT SKILLS: Critical thinking, Problem solving, Tim...</code> | <code>Jæren Kulde söker en erfaren och driven kyltekniker som vill ta nästa steg i karriären och kombinera tekniskt arbete med strategiskt ledarskap. Vi erbjuder en unik möjlighet att utvecklas både tekniskt och som ledare, med en tydlig målsättning att överta rollen som företagets framtida VD.<br>Om Jæren Kulde<br>Jæren Kulde är specialiserade på projektering, installation och underhåll av kyl- och frysanläggningar i varierande storlekar, från mindre lösningar till stora industriella system. Med ett starkt fokus på energieffektivitet och återvinning av överskottsvärme levererar vi hållbara lösningar till kunder inom livsmedelsindustrin och VVS-branschen. Vi är stolta över vårt innovativa arbetssätt och vårt starka engagemang för kvalitet.<br>Om tjänsten<br>I rollen som ledande kyltekniker kommer du att arbeta i nära samarbete med vår erfarna VD, som har över 50 års branscherfarenhet. Du får möjlighet att kombinera teknisk expertis med gradvis ökat ansvar för företagets strategiska utveckling. Dina huvu...</code> | <code>0.0</code> |
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Framework Versions
- Python: 3.12.2
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.5.1+cpu
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
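The card's usage snippet computes a full pairwise similarity matrix; for the job-matching scenario this model is named after, scoring job ads against a single CV embedding reduces to sorting by cosine similarity. A sketch with stand-in vectors (the 4-dimensional embeddings and ad names are made up for illustration; a real run would use `model.encode(...)` with 384 dimensions):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-d vectors, matching similarity_fn_name: 'cosine'."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in embeddings; in practice these come from model.encode([...]).
cv = np.array([1.0, 0.0, 1.0, 0.0])
ads = {
    "ad_consulting": np.array([1.0, 0.1, 0.9, 0.0]),
    "ad_retail":     np.array([0.0, 1.0, 0.0, 1.0]),
}
# Rank ads by descending similarity to the CV.
ranked = sorted(ads, key=lambda name: cosine(cv, ads[name]), reverse=True)
print(ranked)  # ['ad_consulting', 'ad_retail']
```

With normalized embeddings the ranking is the same as sorting by dot product, which is why cosine is the card's declared similarity function.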
config.json
ADDED
@@ -0,0 +1,26 @@
{
  "_name_or_path": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.47.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 250037
}
config_sentence_transformers.json
ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.3.1",
    "transformers": "4.47.1",
    "pytorch": "2.5.1+cpu"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7f4f89d628f87ade0e0b57c40affb6402cd77abc8110584d8d35dc86da514ee8
size 470637416
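The three lines above are a Git LFS pointer, not the actual weights; cloning without `git lfs pull` leaves only this stub in place of the 470 MB safetensors file. A small parser for the pointer format (a sketch; the key/value layout follows the spec version line the pointer itself references):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7f4f89d628f87ade0e0b57c40affb6402cd77abc8110584d8d35dc86da514ee8
size 470637416"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 470637416
```

Comparing the `size` field against the on-disk file size is a quick way to tell whether the real weights were fetched or only the pointer was checked out.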
modules.json
ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
sentence_bert_config.json
ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 128,
  "do_lower_case": false
}
special_tokens_map.json
ADDED
@@ -0,0 +1,51 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cad551d5600a84242d0973327029452a1e3672ba6313c2a3c3d69c4310e12719
size 17082987
tokenizer_config.json
ADDED
@@ -0,0 +1,65 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "250001": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "cls_token": "<s>",
  "do_lower_case": true,
  "eos_token": "</s>",
  "extra_special_tokens": {},
  "mask_token": "<mask>",
  "max_length": 128,
  "model_max_length": 128,
  "pad_to_multiple_of": null,
  "pad_token": "<pad>",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "</s>",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "<unk>"
}
unigram.json
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:da145b5e7700ae40f16691ec32a0b1fdc1ee3298db22a31ea55f57a966c4a65d
size 14763260