# Hungarian Experimental Sentence-BERT

The pre-trained [huBERT](https://huggingface.co/SZTAKI-HLT/hubert-base-cc) was fine-tuned on the [Hunglish 2.0](http://mokk.bme.hu/resources/hunglishcorpus) parallel corpus to mimic the [bert-base-nli-stsb-mean-tokens](https://huggingface.co/sentence-transformers/bert-base-nli-stsb-mean-tokens) model provided by UKPLab. Sentence embeddings were obtained by applying mean pooling to the huBERT output. The data was split into training (98%) and validation (2%) sets; by the end of training, the mean squared error on the validation set was 0.106. Our code is based on the [Sentence-Transformers](https://www.sbert.net) library. The model was trained for 2 epochs on a single GTX 1080Ti GPU with a batch size of 32, and training took approximately 15 hours.
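
The description above corresponds to the knowledge-distillation recipe in the Sentence-Transformers library: a huBERT student with mean pooling is trained to reproduce the teacher's sentence embeddings under an MSE objective. The following is a minimal sketch of such a run, not our exact training script; the corpus file name `hunglish-train.tsv` and the data-loading details are placeholders.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import ParallelSentencesDataset

# Student: huBERT with mean pooling on top of the token embeddings
word_embedding = models.Transformer('SZTAKI-HLT/hubert-base-cc')
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())  # defaults to mean pooling
student = SentenceTransformer(modules=[word_embedding, pooling])

# Teacher whose sentence embeddings the student learns to mimic
teacher = SentenceTransformer('sentence-transformers/bert-base-nli-stsb-mean-tokens')

# Tab-separated parallel sentence pairs (placeholder file name)
train_data = ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data('hunglish-train.tsv')
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=32)

# Minimize the MSE between student and teacher embeddings
train_loss = losses.MSELoss(model=student)
student.fit(train_objectives=[(train_dataloader, train_loss)], epochs=2)
```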

## Limitations

## Usage

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('NYTK/sentence-transformers-experimental-hubert-hungarian')
embeddings = model.encode(sentences)
print(embeddings)
```
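
Because the sentence embeddings are simply mean-pooled huBERT token embeddings, the model can also be used without the Sentence-Transformers wrapper. Here is a sketch assuming the checkpoint loads with plain `transformers` (the `mean_pooling` helper below is illustrative, not a library function):

```python
import torch
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding positions
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained('NYTK/sentence-transformers-experimental-hubert-hungarian')
model = AutoModel.from_pretrained('NYTK/sentence-transformers-experimental-hubert-hungarian')

sentences = ["This is an example sentence", "Each sentence is converted"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    model_output = model(**encoded)

embeddings = mean_pooling(model_output, encoded['attention_mask'])
print(embeddings)
```

In both cases each sentence should map to a 768-dimensional vector, since huBERT is a BERT-base-sized model.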

## Citation
If you use this model, please cite the following paper: