fragata committed · Commit 4bc0656 · 1 Parent(s): 9765fb9

Update README.md

Files changed (1): README.md (+10 -1)
# Hungarian Experimental Sentence-BERT

The pre-trained [huBERT](https://huggingface.co/SZTAKI-HLT/hubert-base-cc) was fine-tuned on the [Hunglish 2.0](http://mokk.bme.hu/resources/hunglishcorpus) parallel corpus to mimic the [bert-base-nli-stsb-mean-tokens](https://huggingface.co/sentence-transformers/bert-base-nli-stsb-mean-tokens) model provided by UKPLab. Sentence embeddings were obtained by applying mean pooling to the huBERT output. The data was split into training (98%) and validation (2%) sets. By the end of training, a mean squared error of 0.106 was measured on the validation set. Our code is based on the [Sentence-Transformers](https://www.sbert.net) library. The model was trained for 2 epochs on a single GTX 1080 Ti GPU with the batch size set to 32; training took approximately 15 hours.
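The mean pooling mentioned above can be sketched as follows. This is an illustration of the common Sentence-Transformers recipe, not the repository's actual code: the `mean_pooling` helper and the attention-mask weighting are assumptions.

```python
import torch

def mean_pooling(token_embeddings, attention_mask):
    """Average token embeddings, ignoring padding tokens.

    token_embeddings: (batch, seq_len, hidden) output of the encoder
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).float()       # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)     # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)          # number of real tokens per sentence
    return summed / counts                            # (batch, hidden) sentence embeddings
```

Padding tokens are masked out before averaging so that sentences of different lengths in the same batch produce comparable embeddings.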

## Limitations

## Usage

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('NYTK/sentence-transformers-experimental-hubert-hungarian')
embeddings = model.encode(sentences)
print(embeddings)
```
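The vectors returned by `model.encode` are plain NumPy arrays, so sentence similarity can be scored with cosine similarity. A minimal sketch, where the `cosine_similarity` helper and the stand-in vectors are hypothetical (in practice the vectors would come from `model.encode`):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two 1-D embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors; real embeddings come from model.encode(...)
emb_a = np.array([0.1, 0.3, 0.5])
emb_b = np.array([0.2, 0.6, 1.0])
print(cosine_similarity(emb_a, emb_b))  # close to 1.0: the vectors are parallel
```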

## Citation

If you use this model, please cite the following paper: