Mayhem50 committed on
Commit 8348ff0 · 1 Parent(s): 9048053

Upload 10 files

1_Pooling/config.json CHANGED
@@ -1,9 +1,9 @@
  {
  "word_embedding_dimension": 1024,
  "pooling_mode_cls_token": false,
- "pooling_mode_mean_tokens": false,
+ "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
- "pooling_mode_weightedmean_tokens": true,
+ "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false
  }
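The change above switches the pooling layer from position-weighted mean pooling to plain mean pooling. As a rough illustration of the difference, here is a minimal pure-Python sketch. It assumes the weighted variant gives the token at 1-based position i a weight of i, which is how the sentence-transformers `Pooling` module is understood to implement `pooling_mode_weightedmean_tokens`. The helper names and the toy data are hypothetical and not part of this commit.

```python
from typing import List

Vector = List[float]


def _pool(token_embeddings: List[Vector], weights: List[float]) -> Vector:
    """Weighted average of token vectors; a weight of 0 removes a token."""
    dim = len(token_embeddings[0])
    total = max(sum(weights), 1e-9)  # guard against an all-padding input
    return [
        sum(w * tok[d] for w, tok in zip(weights, token_embeddings)) / total
        for d in range(dim)
    ]


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Plain mean over non-padding tokens (mask == 1)."""
    return _pool(token_embeddings, [float(m) for m in attention_mask])


def weighted_mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Position-weighted mean: later tokens count more (assumed weight = position)."""
    weights = [float(m * (i + 1)) for i, m in enumerate(attention_mask)]
    return _pool(token_embeddings, weights)


# Toy sequence of four 2-d token vectors; the last one is padding.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(mean_pool(tokens, mask))
print(weighted_mean_pool(tokens, mask))
```

In the toy example the padding row never contributes, and the weighted variant pulls the result toward the later tokens. The example pads on the right for readability; the tokenizer in this repository is configured with left padding.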
README.md CHANGED
@@ -4,6 +4,7 @@ tags:
  - sentence-transformers
  - feature-extraction
  - sentence-similarity
+ - transformers
  ---

  # {MODEL_NAME}
@@ -33,6 +34,44 @@ print(embeddings)



+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean pooling: take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+ model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
+
+
+
  ## Evaluation Results

  <!--- Describe how your model was evaluated -->
@@ -45,29 +84,31 @@ The model was trained with the parameters:

  **DataLoader**:

- `sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 905 with parameters:
+ `torch.utils.data.dataloader.DataLoader` of length 3076 with parameters:
  ```
- {'batch_size': 624}
+ {'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
  ```

  **Loss**:

- `sentence_transformers.losses.MultipleNegativesRankingLoss.MNRLGradCache`
+ `sentence_transformers.losses.MSELoss.MSELoss`

  Parameters of the fit()-Method:
  ```
  {
  "epochs": 1,
- "evaluation_steps": 90,
- "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
+ "evaluation_steps": 500,
+ "evaluator": "sentence_transformers.evaluation.SequentialEvaluator.SequentialEvaluator",
  "max_grad_norm": 1,
  "optimizer_class": "<class 'transformers.optimization.AdamW'>",
  "optimizer_params": {
- "lr": 0.00032
+ "correct_bias": false,
+ "eps": 1e-06,
+ "lr": 2e-05
  },
  "scheduler": "WarmupLinear",
  "steps_per_epoch": null,
- "warmup_steps": 91,
+ "warmup_steps": 1000,
  "weight_decay": 0.01
  }
  ```
@@ -77,7 +118,7 @@ Parameters of the fit()-Method:
  ```
  SentenceTransformer(
  (0): Transformer({'max_seq_length': 150, 'do_lower_case': False}) with Transformer model: BloomModel
- (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': True, 'pooling_mode_lasttoken': False})
+ (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
  )
  ```
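The README's new fit() parameters combine `"scheduler": "WarmupLinear"` with `"warmup_steps": 1000`, a learning rate of `2e-05`, and a DataLoader of length 3076 for one epoch. The sketch below shows how such a schedule is commonly computed: a linear ramp from 0 to the base rate during warmup, then a linear decay to 0. It assumes the total step count equals DataLoader length times epochs, because `steps_per_epoch` is null. The function name is hypothetical; it is not the library's own implementation.

```python
def warmup_linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the README; TOTAL_STEPS is an assumption (length x epochs).
BASE_LR = 2e-05
WARMUP_STEPS = 1000
TOTAL_STEPS = 3076 * 1

for step in (0, 500, 1000, 2038, 3076):
    print(step, warmup_linear_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS))
```

Under these assumptions roughly the first third of training (1000 of 3076 steps) is spent warming up, compared with about a tenth (91 of 905 steps) in the previous configuration.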
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "bigscience/bloom-560m",
+ "_name_or_path": "Mayhem50/sgpt-bloom-560M-nli-v3",
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
  "BloomModel"
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:140ef9bcc4f8c817983d623387f36750ec34935d61d2b2e053a65cfbb2ad0094
+ oid sha256:7ed607694a81d990d4e8e5a98818a12945cfc1c8d5b60e9b3656c372a0c18b6d
  size 2236953889
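The `pytorch_model.bin` entry is a Git LFS pointer: the weights changed (new `oid`) while the file size stayed the same. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names are hypothetical, and the pointer text is copied from the new side of the diff above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file in chunks and compare its size and digest with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:7ed607694a81d990d4e8e5a98818a12945cfc1c8d5b60e9b3656c372a0c18b6d
size 2236953889
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer(Path("pytorch_model.bin"), pointer)` on a local copy would then tell whether that copy corresponds to this commit rather than the parent; the local path is only an example.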
tokenizer_config.json CHANGED
@@ -3,7 +3,7 @@
  "bos_token": "<s>",
  "eos_token": "</s>",
  "model_max_length": 1000000000000000019884624838656,
- "name_or_path": "bigscience/bloom-560m",
+ "name_or_path": "Mayhem50/sgpt-bloom-560M-nli-v3",
  "pad_token": "<pad>",
  "padding_side": "left",
  "special_tokens_map_file": null,
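The unusual `model_max_length` value in the context lines above is simply `int(1e30)`, the sentinel that the transformers tokenizer code is understood to use when no real length limit is configured. The short sketch below shows where the digits come from and why, in practice, the `max_seq_length` of 150 from the README's architecture block would be the limit that matters. The helper name is hypothetical.

```python
# 1e30 cannot be represented exactly as a float, so converting it to an
# integer yields the long value stored in tokenizer_config.json.
VERY_LARGE_INTEGER = int(1e30)
print(VERY_LARGE_INTEGER)


def effective_max_length(model_max_length: int, max_seq_length: int) -> int:
    """Truncation limit that actually applies: the smaller of the two settings."""
    return min(model_max_length, max_seq_length)


# 150 is the max_seq_length reported for the Transformer module in the README.
print(effective_max_length(VERY_LARGE_INTEGER, 150))
```

When calling the tokenizer directly through transformers, passing `max_length=150` together with `truncation=True` would be one way to mirror that limit; whether that matches the intended usage of this model is an assumption.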