wwydmanski committed
Commit cb09e22 (1 parent: 49570a9)
Upload folder using huggingface_hub
Files changed:
- README.md +109 -144
- model.safetensors +1 -1
README.md
CHANGED
@@ -7,39 +7,43 @@ tags:
|
|
7 |
- sentence-similarity
|
8 |
- feature-extraction
|
9 |
- generated_from_trainer
|
10 |
-
- dataset_size:
|
11 |
- loss:MultipleNegativesRankingLoss
|
12 |
widget:
|
13 |
-
- source_sentence:
|
14 |
sentences:
|
15 |
-
- '
|
16 |
-
- '
|
17 |
-
|
18 |
-
-
|
|
|
|
|
19 |
sentences:
|
20 |
-
- '
|
21 |
-
- '
|
22 |
-
|
23 |
-
- '
|
24 |
-
- source_sentence:
|
25 |
sentences:
|
26 |
-
- '
|
27 |
-
|
28 |
-
- '
|
29 |
-
|
|
|
|
|
|
|
|
|
30 |
sentences:
|
31 |
-
- '
|
32 |
-
- '
|
33 |
-
|
34 |
-
- '
|
35 |
-
- source_sentence:
|
36 |
sentences:
|
37 |
-
- '
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
bakeri. '
|
42 |
-
- 'Mechanisms of Action and Toxicity of the Mycotoxin Alternariol: A Review. '
|
43 |
---
|
44 |
|
45 |
# SentenceTransformer based on allenai/specter2_base
|
@@ -92,9 +96,9 @@ from sentence_transformers import SentenceTransformer
|
|
92 |
model = SentenceTransformer("sentence_transformers_model_id")
|
93 |
# Run inference
|
94 |
sentences = [
|
95 |
-
'
|
96 |
-
'
|
97 |
-
'
|
98 |
]
|
99 |
embeddings = model.encode(sentences)
|
100 |
print(embeddings.shape)
|
@@ -149,19 +153,19 @@ You can finetune this model on your own dataset.
|
|
149 |
#### json
|
150 |
|
151 |
* Dataset: json
|
152 |
-
* Size:
|
153 |
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
|
154 |
* Approximate statistics based on the first 1000 samples:
|
155 |
| | anchor | positive | negative |
|
156 |
|:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
157 |
| type | string | string | string |
|
158 |
-
| details | <ul><li>min:
|
159 |
* Samples:
|
160 |
-
| anchor
|
161 |
-
|
162 |
-
| <code
|
163 |
-
| <code>
|
164 |
-
| <code>
|
165 |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
166 |
```json
|
167 |
{
|
@@ -301,117 +305,78 @@ You can finetune this model on your own dataset.
|
|
301 |
</details>
|
302 |
|
303 |
### Training Logs
|
304 |
-
<details><summary>Click to expand</summary>
|
305 |
-
|
306 |
| Epoch | Step | Training Loss |
|
307 |
|:------:|:----:|:-------------:|
|
308 |
-
| 0.
|
309 |
-
| 0.
|
310 |
-
| 0.
|
311 |
-
| 0.
|
312 |
-
| 0.
|
313 |
-
| 0.
|
314 |
-
| 0.
|
315 |
-
| 0.
|
316 |
-
| 0.
|
317 |
-
| 0.
|
318 |
-
| 0.
|
319 |
-
| 0.
|
320 |
-
| 0.
|
321 |
-
| 0.
|
322 |
-
| 0.
|
323 |
-
| 0.
|
324 |
-
| 0.
|
325 |
-
| 0.
|
326 |
-
| 0.
|
327 |
-
| 0.
|
328 |
-
| 0.
|
329 |
-
| 0.
|
330 |
-
| 0.
|
331 |
-
| 0.
|
332 |
-
| 0.
|
333 |
-
| 0.
|
334 |
-
| 0.
|
335 |
-
| 0.
|
336 |
-
| 0.
|
337 |
-
| 0.
|
338 |
-
| 0.
|
339 |
-
| 0.
|
340 |
-
| 0.
|
341 |
-
| 0.
|
342 |
-
| 0.
|
343 |
-
| 0.
|
344 |
-
| 0.
|
345 |
-
| 0.
|
346 |
-
| 0.
|
347 |
-
| 0.
|
348 |
-
| 0.
|
349 |
-
| 0.
|
350 |
-
| 0.
|
351 |
-
| 0.
|
352 |
-
| 0.
|
353 |
-
| 0.
|
354 |
-
| 0.
|
355 |
-
| 0.
|
356 |
-
| 0.
|
357 |
-
| 0.
|
358 |
-
| 0.
|
359 |
-
| 0.
|
360 |
-
| 0.
|
361 |
-
| 0.
|
362 |
-
| 0.
|
363 |
-
| 0.
|
364 |
-
| 0.
|
365 |
-
| 0.
|
366 |
-
| 0.
|
367 |
-
| 0.
|
368 |
-
| 0.
|
369 |
-
| 0.
|
370 |
-
| 0.
|
371 |
-
| 0.
|
372 |
-
| 0.
|
373 |
-
| 0.
|
374 |
-
| 0.
|
375 |
-
| 0.
|
376 |
-
| 0
|
377 |
-
| 0.6667 | 70 | 0.069 |
|
378 |
-
| 0.6762 | 71 | 0.0243 |
|
379 |
-
| 0.6857 | 72 | 0.0517 |
|
380 |
-
| 0.6952 | 73 | 0.0332 |
|
381 |
-
| 0.7048 | 74 | 0.0662 |
|
382 |
-
| 0.7143 | 75 | 0.0753 |
|
383 |
-
| 0.7238 | 76 | 0.0914 |
|
384 |
-
| 0.7333 | 77 | 0.1094 |
|
385 |
-
| 0.7429 | 78 | 0.0557 |
|
386 |
-
| 0.7524 | 79 | 0.0436 |
|
387 |
-
| 0.7619 | 80 | 0.0137 |
|
388 |
-
| 0.7714 | 81 | 0.0399 |
|
389 |
-
| 0.7810 | 82 | 0.0278 |
|
390 |
-
| 0.7905 | 83 | 0.0438 |
|
391 |
-
| 0.8 | 84 | 0.1392 |
|
392 |
-
| 0.8095 | 85 | 0.0299 |
|
393 |
-
| 0.8190 | 86 | 0.0667 |
|
394 |
-
| 0.8286 | 87 | 0.0404 |
|
395 |
-
| 0.8381 | 88 | 0.0166 |
|
396 |
-
| 0.8476 | 89 | 0.1679 |
|
397 |
-
| 0.8571 | 90 | 0.0282 |
|
398 |
-
| 0.8667 | 91 | 0.0628 |
|
399 |
-
| 0.8762 | 92 | 0.0618 |
|
400 |
-
| 0.8857 | 93 | 0.0167 |
|
401 |
-
| 0.8952 | 94 | 0.2108 |
|
402 |
-
| 0.9048 | 95 | 0.0749 |
|
403 |
-
| 0.9143 | 96 | 0.0997 |
|
404 |
-
| 0.9238 | 97 | 0.0675 |
|
405 |
-
| 0.9333 | 98 | 0.0409 |
|
406 |
-
| 0.9429 | 99 | 0.0355 |
|
407 |
-
| 0.9524 | 100 | 0.1391 |
|
408 |
-
| 0.9619 | 101 | 0.0938 |
|
409 |
-
| 0.9714 | 102 | 0.0526 |
|
410 |
-
| 0.9810 | 103 | 0.0035 |
|
411 |
-
| 0.9905 | 104 | 0.0022 |
|
412 |
-
| 1.0 | 105 | 0.0016 |
|
413 |
|
414 |
-
</details>
|
415 |
|
416 |
### Framework Versions
|
417 |
- Python: 3.9.19
|
|
|
7 |
- sentence-similarity
|
8 |
- feature-extraction
|
9 |
- generated_from_trainer
|
10 |
+
- dataset_size:6574
|
11 |
- loss:MultipleNegativesRankingLoss
|
12 |
widget:
|
13 |
+
- source_sentence: sigma N protein interactions
|
14 |
sentences:
|
15 |
+
- 'Smoking Relapse After Lung Transplantation: Is a Second Transplant Justified? '
|
16 |
+
- 'Core RNA polymerase and promoter DNA interactions of purified domains of sigma
|
17 |
+
N: bipartite functions. '
|
18 |
+
- 'Protein-protein interactions mapped by artificial proteases: where sigma factors
|
19 |
+
bind to RNA polymerase. '
|
20 |
+
- source_sentence: Frailty pathway co-design
|
21 |
sentences:
|
22 |
+
- 'High-Sensitivity Cardiac Troponin I Levels in Normal and Hypertensive Pregnancy. '
|
23 |
+
- 'The systematic approach to improving care for Frail Older Patients (SAFE) study:
|
24 |
+
A protocol for co-designing a frail older person''s pathway. '
|
25 |
+
- 'Frailty: successful clinical practice implementation. '
|
26 |
+
- source_sentence: Diurnal lipid metabolism in lactating sheep
|
27 |
sentences:
|
28 |
+
- 'Interpreting and applying the EUFEST results using number needed to treat: antipsychotic
|
29 |
+
effectiveness in first-episode schizophrenia. '
|
30 |
+
- 'Diurnal variations in the concentration, arteriovenous difference, extraction
|
31 |
+
ratio, and uptake of 3-hydroxybutyrate and plasma free fatty acids in the hind
|
32 |
+
limb of lactating sheep. '
|
33 |
+
- 'Diurnal regulation of milk lipid production and milk secretion in the rat: effect
|
34 |
+
of dietary protein and energy restriction. '
|
35 |
+
- source_sentence: Ectopic gastric mucosa
|
36 |
sentences:
|
37 |
+
- '[Ectopic cardia and gastroesophageal reflux]. '
|
38 |
+
- 'A bacterial toxicity assay performed with microplates, microluminometry and Microtox
|
39 |
+
reagent. '
|
40 |
+
- 'Gastric polyp. '
|
41 |
+
- source_sentence: monograph editing
|
42 |
sentences:
|
43 |
+
- 'Monographs editor. '
|
44 |
+
- 'Maternal stress and high-fat diet effect on maternal behavior, milk composition,
|
45 |
+
and pup ingestive behavior. '
|
46 |
+
- 'The editing life. '
|
|
|
|
|
47 |
---
|
48 |
|
49 |
# SentenceTransformer based on allenai/specter2_base
|
|
|
96 |
model = SentenceTransformer("sentence_transformers_model_id")
|
97 |
# Run inference
|
98 |
sentences = [
|
99 |
+
'monograph editing',
|
100 |
+
'Monographs editor. ',
|
101 |
+
'The editing life. ',
|
102 |
]
|
103 |
embeddings = model.encode(sentences)
|
104 |
print(embeddings.shape)
|
|
|
153 |
#### json
|
154 |
|
155 |
* Dataset: json
|
156 |
+
* Size: 6,574 training samples
|
157 |
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
|
158 |
* Approximate statistics based on the first 1000 samples:
|
159 |
| | anchor | positive | negative |
|
160 |
|:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
161 |
| type | string | string | string |
|
162 |
+
| details | <ul><li>min: 3 tokens</li><li>mean: 7.59 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 19.89 tokens</li><li>max: 70 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 11.97 tokens</li><li>max: 50 tokens</li></ul> |
|
163 |
* Samples:
|
164 |
+
| anchor | positive | negative |
|
165 |
+
|:-------------------------------------------------|:--------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------|
|
166 |
+
| <code>α-Alumina Nanoparticle Grafting</code> | <code>Grafting PMMA Brushes from α-Alumina Nanoparticles via SI-ATRP. </code> | <code>Mesoporous alumina from colloidal biotemplating of Al clusters. </code> |
|
167 |
+
| <code>Congenital candidiasis septic shock</code> | <code>Congenital candidiasis presenting as septic shock without rash. </code> | <code>Congenital cutaneous candidiasis: clinical presentation, pathogenesis, and management guidelines. </code> |
|
168 |
+
| <code>Chronic Venous Occlusion</code> | <code>Anatomic response of canine hindlimb vasculature to chronic venous occlusion. </code> | <code>Chronic venous insufficiency. </code> |
|
169 |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
170 |
```json
|
171 |
{
|
|
|
305 |
</details>
|
306 |
|
307 |
### Training Logs
|
|
|
|
|
308 |
| Epoch | Step | Training Loss |
|
309 |
|:------:|:----:|:-------------:|
|
310 |
+
| 0.0145 | 1 | 2.8777 |
|
311 |
+
| 0.0290 | 2 | 2.8723 |
|
312 |
+
| 0.0435 | 3 | 2.7432 |
|
313 |
+
| 0.0580 | 4 | 2.8806 |
|
314 |
+
| 0.0725 | 5 | 2.3007 |
|
315 |
+
| 0.0870 | 6 | 2.2423 |
|
316 |
+
| 0.1014 | 7 | 1.995 |
|
317 |
+
| 0.1159 | 8 | 1.5115 |
|
318 |
+
| 0.1304 | 9 | 1.41 |
|
319 |
+
| 0.1449 | 10 | 1.243 |
|
320 |
+
| 0.1594 | 11 | 1.1634 |
|
321 |
+
| 0.1739 | 12 | 1.1996 |
|
322 |
+
| 0.1884 | 13 | 1.3653 |
|
323 |
+
| 0.2029 | 14 | 1.5704 |
|
324 |
+
| 0.2174 | 15 | 1.3556 |
|
325 |
+
| 0.2319 | 16 | 1.4051 |
|
326 |
+
| 0.2464 | 17 | 1.0999 |
|
327 |
+
| 0.2609 | 18 | 1.0826 |
|
328 |
+
| 0.2754 | 19 | 1.0449 |
|
329 |
+
| 0.2899 | 20 | 1.0517 |
|
330 |
+
| 0.3043 | 21 | 0.9716 |
|
331 |
+
| 0.3188 | 22 | 1.1993 |
|
332 |
+
| 0.3333 | 23 | 1.1375 |
|
333 |
+
| 0.3478 | 24 | 0.9875 |
|
334 |
+
| 0.3623 | 25 | 0.7656 |
|
335 |
+
| 0.3768 | 26 | 1.2773 |
|
336 |
+
| 0.3913 | 27 | 0.7802 |
|
337 |
+
| 0.4058 | 28 | 0.882 |
|
338 |
+
| 0.4203 | 29 | 1.0534 |
|
339 |
+
| 0.4348 | 30 | 0.9073 |
|
340 |
+
| 0.4493 | 31 | 0.916 |
|
341 |
+
| 0.4638 | 32 | 0.9702 |
|
342 |
+
| 0.4783 | 33 | 1.2868 |
|
343 |
+
| 0.4928 | 34 | 1.0854 |
|
344 |
+
| 0.5072 | 35 | 0.8832 |
|
345 |
+
| 0.5217 | 36 | 0.9139 |
|
346 |
+
| 0.5362 | 37 | 0.9032 |
|
347 |
+
| 0.5507 | 38 | 0.965 |
|
348 |
+
| 0.5652 | 39 | 0.7222 |
|
349 |
+
| 0.5797 | 40 | 0.6682 |
|
350 |
+
| 0.5942 | 41 | 0.8562 |
|
351 |
+
| 0.6087 | 42 | 0.9248 |
|
352 |
+
| 0.6232 | 43 | 0.9867 |
|
353 |
+
| 0.6377 | 44 | 0.7328 |
|
354 |
+
| 0.6522 | 45 | 0.7506 |
|
355 |
+
| 0.6667 | 46 | 0.7952 |
|
356 |
+
| 0.6812 | 47 | 0.7979 |
|
357 |
+
| 0.6957 | 48 | 1.0043 |
|
358 |
+
| 0.7101 | 49 | 1.0428 |
|
359 |
+
| 0.7246 | 50 | 0.8772 |
|
360 |
+
| 0.7391 | 51 | 0.6598 |
|
361 |
+
| 0.7536 | 52 | 0.7804 |
|
362 |
+
| 0.7681 | 53 | 0.599 |
|
363 |
+
| 0.7826 | 54 | 0.7974 |
|
364 |
+
| 0.7971 | 55 | 0.7489 |
|
365 |
+
| 0.8116 | 56 | 0.8701 |
|
366 |
+
| 0.8261 | 57 | 0.8903 |
|
367 |
+
| 0.8406 | 58 | 0.7223 |
|
368 |
+
| 0.8551 | 59 | 0.925 |
|
369 |
+
| 0.8696 | 60 | 1.0247 |
|
370 |
+
| 0.8841 | 61 | 0.7531 |
|
371 |
+
| 0.8986 | 62 | 0.9684 |
|
372 |
+
| 0.9130 | 63 | 0.7462 |
|
373 |
+
| 0.9275 | 64 | 0.8555 |
|
374 |
+
| 0.9420 | 65 | 0.8016 |
|
375 |
+
| 0.9565 | 66 | 0.7603 |
|
376 |
+
| 0.9710 | 67 | 1.1052 |
|
377 |
+
| 0.9855 | 68 | 0.9505 |
|
378 |
+
| 1.0 | 69 | 0.6259 |
|
379 |
|
|
|
380 |
|
381 |
### Framework Versions
|
382 |
- Python: 3.9.19
|
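The Epoch column in both training logs appears to be the step index divided by the number of optimizer steps in one epoch (105 in the removed log, 69 in the added one). The sketch below reproduces a few of the logged values under that assumption. The helper name `epoch_fraction` is hypothetical and is not part of sentence-transformers or any other library.

```python
def epoch_fraction(step: int, steps_per_epoch: int) -> float:
    """Return the fraction of an epoch completed after `step` optimizer steps.

    Assumption: the logged Epoch value is step / steps_per_epoch,
    rounded to four decimal places.
    """
    if steps_per_epoch <= 0:
        raise ValueError("steps_per_epoch must be positive")
    return round(step / steps_per_epoch, 4)


# Added log: the last row is step 69 at epoch 1.0.
for step in (1, 2, 35, 69):
    print(step, epoch_fraction(step, 69))

# Removed log: the last row is step 105 at epoch 1.0.
for step in (70, 104, 105):
    print(step, epoch_fraction(step, 105))
```

The diff does not show the batch size or device count, so this sketch does not try to derive them from the 6,574 training samples.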
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 439696224
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8bde86f785555d47618677bc7c74848231a3556a1eb547e6ded8a24d9917051b
|
3 |
size 439696224
|
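The `model.safetensors` entry above is a Git LFS pointer: three text lines giving the spec version, the SHA-256 object id, and the size in bytes. A downloaded weights file can be checked against such a pointer with the Python standard library alone. This is a minimal sketch, not a tool that ships with `huggingface_hub`; the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields.get("oid", "").partition(":")
    return {
        "version": fields.get("version"),
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 object ids are handled in this sketch")
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so a file of several hundred MB is not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents as shown on the added side of this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8bde86f785555d47618677bc7c74848231a3556a1eb547e6ded8a24d9917051b
size 439696224
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

The size is unchanged between the two pointers (439696224 bytes), so only the digest distinguishes the old weights from the new ones. To check a local copy, call `matches_pointer(Path("model.safetensors"), pointer)`, where the path is whatever location the file was downloaded to.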