Update README.md
Commit 2f712d1 by uvegesistvan (1 parent: 1d10483)

README.md CHANGED
@@ -30,9 +30,7 @@ Cased fine-tuned BERT model for Hungarian, trained on a dataset provided by Nati
 Refined version of the huBERTPlain ('uvegesistvan/huBERTPlain') model.
 Trainig data cleaned further:
 * Minor corrections in sentence segmentation results.
-* Train data filtered: sentence pairs (original - rephrased) filtered out in each document,
-* where two sentences' Levenstein distance was less then 3. These assumed to be spelling corrections,
-* therefore potentially less helpful for Plain Language classification.
+* Train data filtered: sentence pairs (original - rephrased) filtered out in each document, where two sentences' Levenstein distance was less then 3. These assumed to be spelling corrections, therefore potentially less helpful for Plain Language classification.
 
 ## Intended uses & limitations
 
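
The filtering rule described in the changed README line (drop original/rephrased sentence pairs whose Levenshtein distance is less than 3) could be sketched roughly as follows. This is only an illustration, not the author's preprocessing code: it assumes a character-level edit distance with unit costs, and the names `levenshtein`, `filter_pairs`, and `min_distance` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance with unit-cost insert, delete, substitute."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def filter_pairs(pairs, min_distance=3):
    """Keep (original, rephrased) pairs whose distance is at least min_distance.

    Pairs below the threshold are assumed to be spelling corrections
    and are dropped (the threshold of 3 follows the README text).
    """
    return [(o, r) for o, r in pairs if levenshtein(o, r) >= min_distance]


if __name__ == "__main__":
    pairs = [("teh cat", "the cat"), ("kitten", "sitting")]
    print(filter_pairs(pairs))
```

In this sketch the pair `("teh cat", "the cat")` has distance 2 and is filtered out, while `("kitten", "sitting")` has distance 3 and is kept. Whether the original pipeline compared characters or tokens is not stated in the README.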