d1mitriz committed
Commit: 6e633bd
Parent: a387767

added proper citation to readme

Files changed (1):
  1. README.md +49 -28
README.md CHANGED
@@ -13,26 +13,27 @@ metrics:
   - accuracy_manhattan
 model-index:
 - name: st-greek-media-bert-base-uncased
-  results: [
-    {
-      "task": {
-        "name": "STS Benchmark",
-        "type": "sentence-similarity"
+  results:
+    [
+      {
+        "task": { "name": "STS Benchmark", "type": "sentence-similarity" },
+        "metrics":
+          [
+            { "type": "accuracy_cosinus", "value": 0.9563965089445283 },
+            { "type": "accuracy_euclidean", "value": 0.9566394253292384 },
+            { "type": "accuracy_manhattan", "value": 0.9565353183072198 },
+          ],
+        "dataset":
+          {
+            "name": "all_custom_greek_media_triplets",
+            "type": "sentence-pair",
+          },
       },
-      "metrics": [
-        { "type": "accuracy_cosinus", "value": 0.9563965089445283 },
-        { "type": "accuracy_euclidean", "value": 0.9566394253292384 },
-        { "type": "accuracy_manhattan", "value": 0.9565353183072198 }
-      ],
-      "dataset": {
-        "name": "all_custom_greek_media_triplets",
-        "type": "sentence-pair"
-      },
-    }
-  ]
+    ]
 ---
 
 # Greek Media SBERT (uncased)
+
 ## Sentence Transformer
 
 This is a [sentence-transformers](https://www.SBERT.net) model based on the [Greek Media BERT (uncased)](https://huggingface.co/dimitriz/greek-media-bert-base-uncased) model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
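As a quick orientation for the description above, here is a minimal usage sketch (illustrative, not part of this commit) using the standard sentence-transformers API; the example sentences are made up:

```python
# Minimal usage sketch: load the released checkpoint and embed two sentences.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")

sentences = ["Αυτή είναι μια πρόταση.", "Μια δεύτερη πρόταση."]  # made-up examples
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768): one 768-dimensional vector per sentence
```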
@@ -103,8 +104,8 @@ print(sentence_embeddings)
 
 <!--- Describe how your model was evaluated -->
 
-For an automated evaluation of this model, see the *Sentence Embeddings
-Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=dimitriz/st-greek-media-bert-base-uncased)
+For an automated evaluation of this model, see the _Sentence Embeddings
+Benchmark_: [https://seb.sbert.net](https://seb.sbert.net?model_name=dimitriz/st-greek-media-bert-base-uncased)
 
 ## Training
 
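The accuracy_cosinus, accuracy_euclidean and accuracy_manhattan metrics in the metadata match the output of sentence-transformers' TripletEvaluator. A sketch of that evaluation, assuming this evaluator was used (the all_custom_greek_media_triplets data is not public, so the triplets below are dummies):

```python
# Sketch of how triplet accuracies like those in the metadata are typically
# produced: for each (anchor, positive, negative) triplet, the evaluator checks
# whether the anchor embedding is closer to the positive than to the negative.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")

anchors = ["news headline about sports"]       # dummy triplet, for illustration
positives = ["a rephrased sports headline"]
negatives = ["a headline about politics"]

evaluator = TripletEvaluator(anchors, positives, negatives, name="media-triplets")
# Reports accuracy under cosine, euclidean and manhattan distance.
print(evaluator(model))
```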
@@ -132,9 +133,9 @@ The model was trained with the parameters:
 
 `sentence_transformers.losses.TripletLoss.TripletLoss` with parameters:
 
-```
-{'distance_metric': 'TripletDistanceMetric.EUCLIDEAN', 'triplet_margin': 5}
-```
+```
+{'distance_metric': 'TripletDistanceMetric.EUCLIDEAN', 'triplet_margin': 5}
+```
 
 Parameters of the fit()-Method:
 
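For reference, a sketch of wiring this loss configuration into a standard sentence-transformers training loop; the training example, dataloader and batch size below are illustrative, not the commit's actual training setup:

```python
# Sketch of the TripletLoss configuration shown above in a minimal fit() call.
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

# Loading a plain BERT checkpoint this way auto-adds a mean-pooling layer.
model = SentenceTransformer("dimitriz/greek-media-bert-base-uncased")

train_examples = [InputExample(texts=["anchor", "positive", "negative"])]  # dummy
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)  # batch size illustrative

train_loss = losses.TripletLoss(
    model=model,
    distance_metric=losses.TripletDistanceMetric.EUCLIDEAN,  # as in the card
    triplet_margin=5,                                        # as in the card
)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```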
@@ -159,17 +160,37 @@ Parameters of the fit()-Method:
 
 ```
 SentenceTransformer(
-  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
 )
 ```
 
 ## Citing & Authors
 
-```@inproceedings{...,
-  title={DACL},
-  author={Zaikis et al.},
-  booktitle={...},
-  year={2023}
+The model has been officially released with the article "DACL: A Domain-Adapted Contrastive Learning Approach to Low Resource Language Representations for Document Clustering Tasks"
+by Dimitrios Zaikis, Stylianos Kokkas and Ioannis Vlahavas.
+In: Iliadis, L., Maglogiannis, I., Alonso, S., Jayne, C., Pimenidis, E. (eds) Engineering Applications of Neural Networks. EANN 2023. Communications in Computer and Information Science, vol 1826. Springer, Cham.
+
+If you use the model, please cite the following:
+
+```bibtex
+@InProceedings{10.1007/978-3-031-34204-2_47,
+author="Zaikis, Dimitrios
+and Kokkas, Stylianos
+and Vlahavas, Ioannis",
+editor="Iliadis, Lazaros
+and Maglogiannis, Ilias
+and Alonso, Serafin
+and Jayne, Chrisina
+and Pimenidis, Elias",
+title="DACL: A Domain-Adapted Contrastive Learning Approach to Low Resource Language Representations for Document Clustering Tasks",
+booktitle="Engineering Applications of Neural Networks",
+year="2023",
+publisher="Springer Nature Switzerland",
+address="Cham",
+pages="585--598",
+isbn="978-3-031-34204-2"
 }
-```
+```
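The module stack printed in the last hunk can be reproduced with the standard sentence-transformers building blocks; a sketch, assuming the Greek Media BERT backbone named in the card:

```python
# Sketch reproducing the printed architecture: a Transformer module over the
# BERT backbone, followed by mean pooling of the token embeddings.
from sentence_transformers import SentenceTransformer, models

word_embedding_model = models.Transformer(
    "dimitriz/greek-media-bert-base-uncased",  # backbone named in the card
    max_seq_length=512,
    do_lower_case=False,
)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode_mean_tokens=True,   # mean pooling, as in the printed config
    pooling_mode_cls_token=False,
    pooling_mode_max_tokens=False,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model)  # should match the SentenceTransformer(...) dump above
```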