fragata committed on
Commit 9765fb9 · 1 Parent(s): ea22dca

Update README.md

Files changed (1)
  1. README.md +26 -1
README.md CHANGED
@@ -12,4 +12,29 @@ widget:
   - "Szép az autó."
   - "Elutazok egy napra."
   example_title: "Példa"
- ---
+ ---
+
+ # Hungarian Experimental Sentence-BERT
+
+ The pre-trained [hubert-base-cc](https://huggingface.co/SZTAKI-HLT/hubert-base-cc) model was fine-tuned on the [Hunglish 2.0](http://mokk.bme.hu/resources/hunglishcorpus/) parallel corpus to mimic the [bert-base-nli-stsb-mean-tokens](https://huggingface.co/sentence-transformers/bert-base-nli-stsb-mean-tokens) model provided by UKPLab. Sentence embeddings were obtained by applying mean pooling to the huBERT output. The data was split into training (98%) and validation (2%) sets; by the end of training, the mean squared error on the validation set was 0.106. Our code is based on the [Sentence-Transformers](https://www.sbert.net) library. The model was trained for 2 epochs on a single GTX 1080Ti GPU with a batch size of 32; training took approximately 15 hours.
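+
+ As an illustration of this recipe, a minimal training sketch (not the authors' actual script) built on the Sentence-Transformers API could look as follows; the parallel pairs below are placeholders standing in for the Hunglish 2.0 data:
+
+ ```python
+ from torch.utils.data import DataLoader
+ from sentence_transformers import SentenceTransformer, models, losses, InputExample
+
+ # Teacher: the English SBERT model whose embedding space the student should mimic.
+ teacher = SentenceTransformer('sentence-transformers/bert-base-nli-stsb-mean-tokens')
+
+ # Student: huBERT with mean pooling over the token embeddings, truncated at 128 tokens.
+ word_embedding = models.Transformer('SZTAKI-HLT/hubert-base-cc', max_seq_length=128)
+ pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode='mean')
+ student = SentenceTransformer(modules=[word_embedding, pooling])
+
+ # Placeholder (English, Hungarian) pairs standing in for the Hunglish 2.0 corpus.
+ parallel_pairs = [
+     ('The car is beautiful.', 'Szép az autó.'),
+     ('I am going away for a day.', 'Elutazok egy napra.'),
+ ]
+
+ # Both the English and the Hungarian sentence are trained to reproduce the
+ # teacher's embedding of the English side (mean squared error objective).
+ train_examples = []
+ for en, hu in parallel_pairs:
+     target = teacher.encode(en)
+     train_examples.append(InputExample(texts=[en], label=target))
+     train_examples.append(InputExample(texts=[hu], label=target))
+
+ train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
+ train_loss = losses.MSELoss(model=student)
+ student.fit(train_objectives=[(train_dataloader, train_loss)], epochs=2)
+ ```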
+
+ ## Limitations
+
+ - max_seq_length = 128
+
+ ## Usage
+
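+ Assuming the repository is saved as a Sentence-Transformers model, a minimal usage sketch (the model identifier below is a placeholder for this repository's Hub id):
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Placeholder id -- replace with this model's actual Hub repository id.
+ model = SentenceTransformer('path/to/this-model')
+
+ sentences = ['Szép az autó.', 'Elutazok egy napra.']
+ embeddings = model.encode(sentences)
+
+ # One fixed-size vector per sentence (768 dimensions for the huBERT base encoder).
+ print(embeddings.shape)
+ ```
+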
+ ## Citation
+
+ If you use this model, please cite the following paper:
+
+ ```
+ @article{bertopic,
+   title = {Analyzing Narratives of Patient Experiences: A BERT Topic Modeling Approach},
+   journal = {Acta Polytechnica Hungarica},
+   year = {2023},
+   author = {Osváth, Mátyás and Yang, Zijian Győző and Kósa, Karolina},
+   pages = {153--171},
+   volume = {20},
+   number = {7}
+ }
+ ```