daviddrzik
committed on
Update README.md
README.md
CHANGED
@@ -1,14 +1,14 @@
----
-license: mit
-language:
-- sk
-datasets:
-- oscar-corpus/OSCAR-2109
-pipeline_tag: fill-mask
-library_name: transformers
-tags:
-- slovak-language-model
----
+---
+license: mit
+language:
+- sk
+datasets:
+- oscar-corpus/OSCAR-2109
+pipeline_tag: fill-mask
+library_name: transformers
+tags:
+- slovak-language-model
+---
 
 # Slovak BPE Baby Language Model (SK_BPE_BLM)
 
@@ -106,4 +106,26 @@ Here are the fine-tuned versions of the `SK_BPE_BLM` model based on the folders
 - [`SK_BPE_BLM-sentiment-csfd`](https://huggingface.co/daviddrzik/SK_BPE_BLM-sentiment-csfd): Fine-tuned for sentiment analysis on the CSFD (movie review) dataset.
 - [`SK_BPE_BLM-sentiment-multidomain`](https://huggingface.co/daviddrzik/SK_BPE_BLM-sentiment-multidomain): Fine-tuned for sentiment analysis across multiple domains.
 - [`SK_BPE_BLM-sentiment-reviews`](https://huggingface.co/daviddrzik/SK_BPE_BLM-sentiment-reviews): Fine-tuned for sentiment analysis on general review datasets.
-- [`SK_BPE_BLM-topic-news`](https://huggingface.co/daviddrzik/SK_BPE_BLM-topic-news): Fine-tuned for topic classification in news articles.
+- [`SK_BPE_BLM-topic-news`](https://huggingface.co/daviddrzik/SK_BPE_BLM-topic-news): Fine-tuned for topic classification in news articles.
+
+## Citation
+
+If you find our model or paper useful, please consider citing our work:
+
+### Article:
+Držík, D., & Forgac, F. (2024). Slovak morphological tokenizer using the Byte-Pair Encoding algorithm. PeerJ Computer Science, 10, e2465. https://doi.org/10.7717/peerj-cs.2465
+
+### BibTeX Entry:
+```bib
+@article{drzik2024slovak,
+title={Slovak morphological tokenizer using the Byte-Pair Encoding algorithm},
+author={Držík, Dávid and Forgac, František},
+journal={PeerJ Computer Science},
+volume={10},
+pages={e2465},
+year={2024},
+month={11},
+issn={2376-5992},
+doi={10.7717/peerj-cs.2465}
+}
+```