# Hungarian Experimental Sentence-BERT

The pre-trained [huBERT](https://huggingface.co/SZTAKI-HLT/hubert-base-cc) was fine-tuned on the [Hunglish 2.0](http://mokk.bme.hu/resources/hunglishcorpus) parallel corpus to mimic the [bert-base-nli-stsb-mean-tokens](https://huggingface.co/sentence-transformers/bert-base-nli-stsb-mean-tokens) model provided by UKPLab. Sentence embeddings were obtained by applying mean pooling to the huBERT output. The data was split into training (98%) and validation (2%) sets; by the end of training, the mean squared error on the validation set was 0.106. Our code is based on the [Sentence-Transformers](https://www.sbert.net) library. The model was trained for 2 epochs on a single GTX 1080Ti GPU with a batch size of 32, and training took approximately 15 hours.
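
The description above corresponds to the knowledge-distillation recipe in the Sentence-Transformers library: a huBERT student with mean pooling is trained to reproduce the teacher's sentence embeddings under an MSE objective. The following is a minimal sketch of such a run, not our exact training script; the corpus file name `hunglish-train.tsv` and the data-loading details are placeholders.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import ParallelSentencesDataset

# Student: huBERT with mean pooling on top of the token embeddings
word_embedding = models.Transformer('SZTAKI-HLT/hubert-base-cc')
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())  # defaults to mean pooling
student = SentenceTransformer(modules=[word_embedding, pooling])

# Teacher whose sentence embeddings the student learns to mimic
teacher = SentenceTransformer('sentence-transformers/bert-base-nli-stsb-mean-tokens')

# Tab-separated parallel sentence pairs (placeholder file name)
train_data = ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data('hunglish-train.tsv')
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=32)

# Minimize the MSE between student and teacher embeddings
train_loss = losses.MSELoss(model=student)
student.fit(train_objectives=[(train_dataloader, train_loss)], epochs=2)
```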

## Limitations

## Usage

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('NYTK/sentence-transformers-experimental-hubert-hungarian')
embeddings = model.encode(sentences)
print(embeddings)
```
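
Because the sentence embeddings are simply mean-pooled huBERT token embeddings, the model can also be used without the Sentence-Transformers wrapper. Here is a sketch assuming the checkpoint loads with plain `transformers` (the `mean_pooling` helper below is illustrative, not a library function):

```python
import torch
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding positions
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained('NYTK/sentence-transformers-experimental-hubert-hungarian')
model = AutoModel.from_pretrained('NYTK/sentence-transformers-experimental-hubert-hungarian')

sentences = ["This is an example sentence", "Each sentence is converted"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    model_output = model(**encoded)

embeddings = mean_pooling(model_output, encoded['attention_mask'])
print(embeddings)
```

In both cases each sentence should map to a 768-dimensional vector, since huBERT is a BERT-base-sized model.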

## Citation
If you use this model, please cite the following paper: